1 Introduction

Traffic data collected at roadside sensors can offer significant value to transport managers. The raw data is typically transformed into a time series format, capturing metrics such as the vehicle count or average speed over the road network. This information can be used to make forecasts about the state of the road network in the near future, which can enable proactive responses when heavy or unusual load on the network is predicted.

A wide range of forecasting approaches have been applied to the traffic prediction task, from statistical methods such as ARIMA [2, 17, 32], to Deep Learning (DL) models such as LSTM [20]. In recent years, Graph Neural Network (GNN) approaches have achieved state-of-the-art results, due to their ability to capture spatial dependencies between sensors [19, 27, 30, 34, 36, 51, 62, 68]. GNNs typically model the road sensor network as a graph structure, whose weighted adjacency matrix reflects the strength of inter-sensor relationships.

Despite an extensive body of research into the traffic forecasting problem, there are still several challenges to overcome when building practical solutions. First, there are operational requirements around the scalable handling and pre-processing of the streaming traffic data, to enable its use for real-time forecasting. Typically, forecasting models are developed for offline use, and do not consider the challenges of producing forecasts on streaming data. The real-time forecasting problem requires that the prediction process takes place continuously within a given time lag of each real-world traffic event occurring (i.e., vehicles passing a sensor). This is an important problem to resolve for practical data-driven systems, as transport managers need to be able to take action based on responsive short-term forecasts. It has also been identified as an open research issue, and entails significant data management challenges, particularly when DL models are employed [9]. Furthermore, existing research has largely considered forecasting 1 hour ahead, mostly on static data, with only a few works seeking to predict further ahead [6, 67]. It would be beneficial to transport managers if accurate forecasts could be made further into the future, allowing more time for responsive action to be taken. Next, in addition to data captured at roadside sensors, dynamic urban events (DUE) and vehicle-level flow data should also be incorporated into forecasting models to improve predictive performance. Finally, many inference workloads are sporadic in nature, with queries arriving at irregular intervals and being distributed over multiple forecasting models/scales. Whilst an inference platform built using always-available virtual machine (VM) resources may not be a cost-effective solution for such workloads, there are challenges associated with achieving efficient inference performance on pay-as-you-go alternatives such as serverless computing. 
These include provider-imposed restrictions on vCPU, memory, and maximum function runtime [49].

The Foresight cloud-based forecasting system [12] achieves real-time forecasting over urban traffic data, and effectively leverages DUE and vehicle-level flow data to improve predictive performance. In this work, we extend Foresight with several novel enhancements; we term the improved system Foresight Plus.

First, we present an approach for extending the forecasting scale in Foresight Plus beyond the ‘1 hour ahead’ horizons typically seen in this domain. Our extended solution provides efficient inference while enabling forecasting many hours into the future, with little to no degradation in the quality of predictions. Further, we design a fully serverless inference solution for traffic forecasting. Foresight Plus is more cost-effective than a provisioned inference solution for many workloads, and seamlessly handles requests over multiple predictive models with an attractive cost-to-performance ratio.

The contributions of this work are as follows:

  • We present Foresight Plus, a cloud-based real-time traffic forecasting system which extends the forecasting scales catered for in Foresight, efficiently enabling predictions further into the future.

  • We design a fully serverless inference solution which handles sporadic inference workloads. We also present a cost model for serverless forecasting, and consider the implications of several design choices in this context.

  • We observe that GNN forecasting models are robust to extensions of the forecasting scale up to 48 hours ahead, and can achieve improved performance as the scale grows.

  • We identify fully serverless inference as a cost-effective and efficient solution for sporadic inference workloads. We study the scalability, cost and performance characteristics of serverless offerings to optimize resource configurations.

The rest of the paper is organized as follows. Section 2 presents related work. Section 3 formalizes the traffic forecasting problem, and describes its real-time extensions. Section 4 illustrates Dynamic Urban Events, before Section 5 describes the Flow-based GNN Adjacency Matrix. Section 6 presents our approach for extending the forecasting scale. Section 7 then describes our fully serverless inference solution for sporadic workloads. The Foresight Plus system architecture is illustrated in Section 8. Section 9 covers our experimental analysis, both of Foresight and Foresight Plus. Finally, Section 10 concludes the paper.

2 Related work

Identifying the future state of a system via forecasting has been applied in a wide range of disciplines including economics [31], energy and environmental studies [3, 4, 23, 41], epidemiology [22, 53], crowd flow prediction [26] and transport management [13, 16, 47, 59, 65].

Various systems and models have been utilized to achieve valuable predictive outcomes such as accurate traffic predictions on the road network. The Autoregressive Integrated Moving Average (ARIMA) and its variations have been consistently popular time-series models [2, 17, 32]. Machine Learning (ML) approaches have also been applied, with the Support Vector Machine (SVM) [63, 71], XGBoost [15] and the Random Forest [5, 43, 63] being the most commonly used. Deep Learning (DL) solutions based on Artificial Neural Networks have been increasingly utilized due to their improved forecasting accuracy and the ability to account for non-linear dependencies [25, 33, 38, 54, 55]. Long Short-Term Memory (LSTM) and Feed Forward Neural Networks (FFNN) are among the popular models applied to forecast traffic flows [20, 37, 39, 56], with several hybrid approaches also investigated [57, 70]. Finally, Graph Neural Networks (GNNs), which can capture the spatial dependencies between the traffic monitoring sensors by representing the road network as a graph structure, have further improved prediction accuracy. Hence, multiple GNN applications for traffic flow forecasting have been presented in recent years [9, 19, 27-30, 34-36, 51, 62, 68].

2.1 Dynamic urban events

Urban events such as roadworks have been demonstrated to significantly impact traffic flow [7, 48]. Hence, the incorporation of auxiliary information about such events can further improve traffic forecasting performance. For example, roadwork and accident information has been utilized in traffic simulation systems and ML models [1, 8, 36]. A combination of roadworks and weather conditions has been added to a bi-directional LSTM Autoencoder for short-term traffic prediction [18].

2.2 GNN adjacency matrix

In GNN models, the underlying graph structures are usually represented with an adjacency matrix which captures the spatial relationships between the nodes of a graph [19]. Although GNN adjacency matrices are typically binary [21], multiple variations have been proposed [28]. For example, a real-valued distance-based adjacency matrix is a common alternative for representing the spatial dependencies between nodes, and has been applied in numerous traffic forecasting studies with GNNs [10, 45, 52, 60, 69]. The travel time between nodes has also been considered as an alternative to distance-based metrics [61]. More recently, dynamic matrices which capture changes in the spatial dependencies of the graph have been introduced [14]. Coarse origin-destination (OD) data has also been applied as a substitute for a distance-based adjacency matrix [64]. Our work instead leverages vehicle-level flow information, obtained from traffic cameras in the West Midlands, to realistically model the propagation of traffic through the network.

2.3 Forecasting systems

In addition to statistical and ML/DL modelling approaches, forecasting systems have also been developed as specialized tools for time series prediction and road network management. For example, the AutoAI for Time Series Forecasting (AUTOAI-TS) [50] automates the selection, training and optimization of forecasting models for a given dataset. DeepTRANS [58] combines the DeepTTE system [40] with DCRNN [34] for bus travel time estimation. The system uses archive information about bus and traffic flow from sensor data, and DCRNN is used to estimate traffic speed at buses’ locations. The TrafficStream forecasting system leverages GNNs and Continual Learning (CL) [11]. It constructs a sub-graph to capture network expansion, and constraints are applied on the current training model to integrate information from historical data. In contrast, Foresight Plus focuses on the design and evaluation of practical pipelines to satisfy real-time forecasting requirements, supporting extended predictive scales and using a fully serverless inference architecture.

3 Real-time spatio-temporal forecasting

In this section, we first describe the traffic forecasting problem, before introducing its real-time variant. A key requirement of this procedure is that the aggregation, pre-processing and inference of the traffic data must take place within a certain time period. These practical aspects of forecasting have attracted relatively little attention in the large body of research on the topic.

3.1 Traffic forecasting problem

We first present the definition of the traffic forecasting problem, where the goal is to predict the future state of the road network, given a sequence of previously observed time series readings. Traffic information is typically obtained from roadside sensors, which can capture features such as traffic counts or average speed, to form a (multivariate) time series. Given a set of sensors S, we denote the traffic information observed across all sensors as \(\varvec{X} \in \mathbb {R}^{H \times |S| \times P}\), where H is the total number of historical traffic readings, and P is the number of predictive features used. Let \(\varvec{X}^{(t)} \in \mathbb {R}^{|S| \times P}\) denote the traffic signal observed at time t, and \(\varvec{Y}^{(t')} \in \mathbb {R}^{|S| \times Q}\) denote the traffic signal to be predicted at time \(t'\). Note that the number of target features Q may be different to P. We aim to learn a function \(f(\cdot )\) which maps from \(T'\) historical traffic signals to T future traffic signals:

$$\begin{aligned}{}[\varvec{X}^{(t - T' + 1)} , \dots , \varvec{X}^{(t)}] \xrightarrow {f(\cdot )} [\varvec{Y}^{(t+1)}, \dots , \varvec{Y}^{(t+T)}] \end{aligned}$$
(1)
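The mapping in (1) can be made concrete with a minimal shape-level sketch. The dimensions below and the naive persistence rule are purely illustrative placeholders for a trained model such as a GNN; they are not the Foresight Plus configuration.

```python
import numpy as np

# Hypothetical dimensions: T' = 12 historical signals, |S| = 50 sensors,
# P = 2 input features, T = 12 forecast signals, Q = 1 target feature.
T_hist, T_fut, n_sensors, P, Q = 12, 12, 50, 2, 1

def f(history: np.ndarray) -> np.ndarray:
    """Placeholder for the learned function f(.) mapping T' signals to T
    signals. A real model would replace this persistence rule, which
    simply repeats the last observed value of the first feature."""
    assert history.shape == (T_hist, n_sensors, P)
    last = history[-1, :, :Q]                          # shape (|S|, Q)
    return np.repeat(last[None, :, :], T_fut, axis=0)  # shape (T, |S|, Q)

history = np.random.rand(T_hist, n_sensors, P)
forecast = f(history)
assert forecast.shape == (T_fut, n_sensors, Q)
```

The key point is the interface: whatever model implements \(f(\cdot)\), it consumes a \(T' \times |S| \times P\) tensor and emits a \(T \times |S| \times Q\) tensor.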

3.2 Real-time forecasting

The real-time variant of the traffic forecasting problem adds the constraint that all processing takes place within a specified duration following the end of each time bin. Performing real-time forecasting, particularly with DL models, has been identified as a significant challenge [9]. In Foresight Plus, anonymized streaming traffic data is collected at road cameras and ingested into the platform via an API endpoint. Further details of this procedure are illustrated in Section 8.1.

The real-time forecasting routine begins at the end of each time bin, each of which is B minutes long. First, the raw vehicle-level data (held in cloud storage) is aggregated for the most recent time bin (i.e., the B minutes from \(\varvec{X}^{(t-1)}\) to \(\varvec{X}^{(t)}\)). This aggregation entails iterating over all individual vehicle captures that arrived during the time bin, and summing the counts for each vehicle class (e.g., petrol car, HGV), camera and lane. Note that while the vehicle class information is not required for our forecasts, it can be utilized for other data analysis tasks. We denote the time taken for this aggregation as \(T_{Agg}\). Next, the aggregated data is pre-processed so that it is appropriately formatted for model inference. This includes fetching and processing the aggregated traffic count information for the last \(T'\) time bins, as well as retrieving any additional model-specific data used for inference (e.g., roadwork time series, adjacency matrix). The time taken for this phase is referred to as \(T_{PreProc}\). Once the required data have been produced, the inference API endpoint is invoked to perform the forecast. The time taken for inference processing to occur, as in (1), is denoted by \(T_{Inf}\).
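The \(T_{Agg}\) phase described above amounts to a grouped count over the bin's raw captures. The following sketch illustrates this; the field names are illustrative, not the actual Foresight schema.

```python
from collections import Counter

def aggregate_time_bin(captures):
    """Sum the vehicle captures of one B-minute bin into counts keyed by
    (camera, lane, vehicle_class). `captures` is any iterable of dicts
    holding one record per observed vehicle."""
    counts = Counter()
    for c in captures:
        counts[(c["camera"], c["lane"], c["vehicle_class"])] += 1
    return counts

captures = [
    {"camera": "cam_01", "lane": 1, "vehicle_class": "petrol_car"},
    {"camera": "cam_01", "lane": 1, "vehicle_class": "petrol_car"},
    {"camera": "cam_01", "lane": 2, "vehicle_class": "HGV"},
]
counts = aggregate_time_bin(captures)
# counts[("cam_01", 1, "petrol_car")] == 2
```

In the deployed system this aggregation is executed as serverless SQL over columnar storage (Section 8.1) rather than in application code.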

We require the following expression to be satisfied for a system to be capable of real-time forecasting:

$$\begin{aligned} T_{Total} = T_{Agg} + T_{PreProc} + T_{Inf} \le B \end{aligned}$$
(2)

A value of \(T_{Total} \le B\) ensures that the shortest forecasting horizon still pertains to information that is yet to be aggregated in the system, and is therefore relevant to network managers.

4 Dynamic urban events

DUE data, in our use case, is any information beyond the metrics gathered by the network's own sensors which may affect the performance of a traffic prediction solution. Many types of such data influence real-world outcomes, and hence the reliability of the predictive model: scheduled or unscheduled roadworks can slow or impede traffic flows, as can traffic accidents; events at venues linked to the network can change traffic levels and flows; weather conditions affect driving speeds and the likelihood of accidents; delays, cancellations or industrial action on other modes of transport can substantially alter traffic volumes on the road network; and school and public holidays are clearly impactful. Foresight is able to leverage DUE data to improve the accuracy of its forecasts. We use roadworks data as an illustrative example, but other such information (e.g., social event data) could readily be applied in a similar fashion. In the context of traffic forecasting, planned and unplanned roadworks frequently influence the volume and nature of traffic propagation through the road network [7, 48], so incorporating roadwork schedules into predictive models is intuitively helpful for accurate predictions. Foresight automatically ingests DUE data and processes it into a format which forecasting models can easily exploit.

Roadworks data is ingested into Foresight via the Street Manager API, which is invoked to receive a feed of planned roadwork events. We denote the set of all roadworks listed by a given API call as R. For each roadwork \(r \in R\), we obtain its latitude/longitude, as well as its start and end dates \(T_s\) and \(T_e\). In order to associate the live roadworks on a given day T with the road sensor network S, we first select only those roadworks where \(T_s \le T \le T_e\). Next, we calculate the road network distance (using an indicative driving speed over a shortest path calculation on the road network) between each \(r \in R\) and each \(s \in S\). These distances populate an \(|R| \times |S|\) matrix \(\varvec{W}\), with each entry (i, j) denoting the road network distance from live roadwork i to traffic sensor j in the network.

To incorporate this roadwork-to-camera influence information into the forecasting models, we convert \(\varvec{W}\) into a time series format at the same temporal granularity as the observed traffic data. This has been shown to be an effective method for adding roadwork data to forecasting models [36]. We define this as a new feature set \(\varvec{\hat{X}} \in \mathbb {R}^{H \times |S|}\). Each entry \(\varvec{\hat{x_i}} \in \varvec{\hat{X}}^{(t)}\) has a value between 0 and 1 which denotes the strength of the influence of the nearest active roadwork to sensor i at time t. We consider two approaches to approximate this influence. The first is a binary thresholding approach, where entries are activated if there is a roadwork within threshold distance d metres of the sensor. The second method involves first calculating the distance from each sensor to its nearest live roadwork, before normalizing these distances into [0, 1]. We perform this normalization using a thresholded Gaussian kernel, with threshold k.

Combining \(\varvec{X}\) and \(\varvec{\hat{X}}\), a new matrix \(\varvec{\tilde{X}} = \begin{bmatrix} \varvec{X}\\ \varvec{\hat{X}} \end{bmatrix}\) is constructed, which forms the new feature input passed to the forecasting models. We evaluate these approaches within the context of a GNN model in Section 9.
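The two influence approximations described above can be sketched as follows. The parameter names and default values (distance threshold `d`, kernel threshold `k`, bandwidth `sigma`) are illustrative assumptions, not the settings used in Foresight Plus.

```python
import numpy as np

def roadwork_influence(dist_to_nearest, d=500.0, k=0.1, sigma=None,
                       mode="gaussian"):
    """Per-sensor roadwork influence in [0, 1], given the distance (in
    metres) from each sensor to its nearest live roadwork.
    - "binary": 1 if a roadwork lies within d metres, else 0.
    - "gaussian": thresholded Gaussian kernel; weights below k are zeroed.
    """
    dist = np.asarray(dist_to_nearest, dtype=float)
    if mode == "binary":
        return (dist <= d).astype(float)
    # Default bandwidth: standard deviation of the distances (guarded
    # against a zero spread); this choice is an assumption for the sketch.
    sigma = (np.std(dist) or 1.0) if sigma is None else sigma
    w = np.exp(-(dist ** 2) / (sigma ** 2))
    w[w < k] = 0.0
    return w

dist = np.array([100.0, 800.0, 3000.0])
print(roadwork_influence(dist, mode="binary"))  # [1. 0. 0.]
```

One such vector is produced per time bin t, and the vectors are stacked over time to form \(\varvec{\hat{X}}\).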

5 Flow aggregated adjacency matrix

Graph Neural Networks (GNNs) are widely used in state-of-the-art forecasting models [10, 14, 27, 30, 34, 35, 45, 51, 52, 60, 62, 69]. These methods typically represent the traffic sensor network as a graph structure, whose adjacency matrix aims to capture spatial relationships between the sensors. The objective of GNN message passing and node aggregation approaches in the context of traffic forecasting, such as diffusion convolution [34], is to simulate traffic propagation in the network. This method of extracting features is typically embedded into a wider learning structure so that temporal features can be learnt along with spatial features in an integrated fashion.

The graph structure which models the traffic sensor network is described by an \({|S|} \times {|S|}\) (weighted) adjacency matrix. The value at position (i, j) approximates the strength of the relationship between sensor \(s_i\) and sensor \(s_j\). A popular method to assign weights in the adjacency matrix is to calculate pairwise sensor distances measured in the road network [34, 60, 62].

The aim of our approach is to more realistically reflect the actual flow of traffic in the network, compared to coarse sensor separation measures such as Euclidean distance. Simple distance-based measures alone are insufficient, as sensor separation per se does not necessarily indicate traffic flow levels. Even when two sensors are spatially co-located, traffic might rarely pass between them consecutively, or may flow in one direction significantly more than the other; these properties cannot be easily captured by a distance-based approach.

We therefore develop a method for computing the adjacency matrix weights which uses vehicle-level flow data to more accurately determine the relationships between sensors. By leveraging the properties of granular ANPR (Automatic Number Plate Recognition) data, our method can capture, in order, the sequence of sensors which (anonymized) vehicles pass as they traverse the road network. By aggregating this information, we are able to determine actual flows within the network. The new adjacency matrix retains the same dimensions used in most GNN methods for spatio-temporal forecasting, so it is directly applicable within these models.

The Flow Aggregated Adjacency Matrix (FAAM), denoted as \(\varvec{F} \in \mathbb {R}^{\left| S\right| \times \left| S\right| }\), is constructed by aggregating observed flow between cameras within a given time frame. One unit of flow is recorded between sensor \(s_i \in S\) and \(s_j \in S\) when a vehicle is observed at \(s_i\) at time t, and is then next observed at \(s_j\) no later than \(t + \tau \), where \(\tau \) is a parameter given in seconds which denotes the acceptable transition period. Note that we operate on a network of urban roadside sensors, rather than on the underlying road network. Therefore, one cannot claim that the vehicle has not traversed any other roads between \(s_i\) and \(s_j\), but rather that it has not passed any other sensors. Indeed, if the vehicle had passed sensor \(s_k\) between \(s_i\) and \(s_j\), we would record \(s_i \rightarrow s_k\), and then \(s_k \rightarrow s_j\) (as two separate units of flow). This is conceptually different to an origin-destination (OD) approach. To construct \(\varvec{F}\), each entry \(\varvec{F_{i,j}}\) is incremented by 1 for each observed unit of flow. \(\varvec{F_{i,j}}\) is then averaged over all the time periods during which flow was observed, before being normalized into [0, 1]. Each entry \(\varvec{F_{i,j}}\) thus approximates the likelihood of a vehicle transitioning directly from \(s_i\) to \(s_j\) within transition period \(\tau \). The matrix can be periodically updated to reflect changes in the network over time, such as seasonality. We note that a more granular time scale would be possible in this formulation, e.g., to capture shifting traffic patterns throughout the day, but we leave this to be explored in future work.
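The FAAM construction can be sketched as below. The trajectory representation and the simple max-scaling normalization are simplifying assumptions for illustration; the paper additionally averages counts over the observed time periods before normalizing.

```python
import numpy as np

def build_faam(trajectories, sensor_ids, tau=600.0):
    """Sketch of the Flow Aggregated Adjacency Matrix. Each trajectory is
    a time-ordered list of (sensor_id, timestamp) pairs for one anonymized
    vehicle. One unit of flow s_i -> s_j is recorded when consecutive
    observations are at most tau seconds apart."""
    idx = {s: i for i, s in enumerate(sensor_ids)}
    F = np.zeros((len(sensor_ids), len(sensor_ids)))
    for traj in trajectories:
        for (s_i, t_i), (s_j, t_j) in zip(traj, traj[1:]):
            if t_j - t_i <= tau:
                F[idx[s_i], idx[s_j]] += 1
    if F.max() > 0:
        F /= F.max()  # simple normalization into [0, 1]
    return F

trajs = [
    [("A", 0), ("B", 120), ("C", 900)],  # A->B counted; B->C exceeds tau
    [("A", 0), ("B", 300)],
]
F = build_faam(trajs, ["A", "B", "C"], tau=600.0)
# F[0, 1] == 1.0 (A->B observed twice); F[1, 2] == 0.0 (transition too slow)
```

Note the asymmetry of \(\varvec{F}\): flow from \(s_i\) to \(s_j\) is recorded separately from flow in the opposite direction, which a symmetric distance-based matrix cannot express.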

We select a value for \(\tau \) based on a review of typical elapsed travel times between the most separated sensors in terms of road network distance, with some overhead added to capture outliers. Setting \(\tau \) on a global basis was done for simplicity in this case, but there are alternative approaches which could be explored. A \(\tau \) value could be allocated per sensor pair, based on the road distance between them, or on previous vehicle flow times across them. The latter would take account of factors other than distance which may impact traversal time, such as the presence of service/refuelling stations. The \(\tau \) value could even be learned using a separate ML model. Mechanisms for obtaining optimal \(\tau \) settings would be an interesting area for further study.

6 Foresight plus: extending the forecasting scale

The majority of DL-based traffic forecasting models are evaluated on a small number of benchmark datasets such as METR-LA [24], and seek to forecast a maximum of 1 hour ahead [9, 27, 30, 34, 36, 51, 68]. While this is useful in short-term forecasting applications, it would be advantageous to extend the forecasting scale further into the future. The ability to forecast traffic patterns multiple hours ahead would provide more time for transport network managers to perform interventions such as traffic re-routing.

Whilst the short-term (up to 1 hour) prediction problem is now well-studied and recent solutions based on GNNs and Transformers continue to set benchmarks, there is a research gap around models capable of longer-term predictions [72]. Where existing solutions have been applied to this problem, the standard approach has been to produce short-term predictions and to use these as ‘ground truth’ to create the next set of short-term predictions, repeatedly. This presents two main challenges. First, errors are propagated forward and accumulated at each phase. Second, computational complexity (and cost) becomes an issue. It has been estimated that following the above approach, using 7 days of historical data and predicting 12 hours ahead, the training time for DCRNN (for METR-LA on a single NVIDIA TITAN RTX 16 GB GPU) would be over 7,000 hours [72]. Clearly a more efficient approach is required, and new studies which predict further into the future without using predictions as ground truth are now appearing [66].

We develop a lightweight approach using the aggregation of multiple historical timesteps to enable longer-term forward predictions in a single pass. We employ a GNN forecasting model but do not rely on the use of predictions as ground truth. The approach performs well, as confirmed by our experimental analysis. In light of this, we extend the architecture of the existing Foresight system to offer multiple forecasting scales, henceforth denoted k. In particular, we cater for traffic forecasts multiple hours into the future (\(k \in \{1, 3, 6, 12, 24, 36, 48\}\) hours ahead at present). The user is able to dynamically select from the supported scales at inference time with no additional provisioning time/costs. Foresight Plus forecasts further into the future than several previous approaches [6, 67], which have considered forecasts up to 4 and 10 hours ahead, respectively.

Whilst we acknowledge that (absolute) predictive errors will naturally increase as forecasts extend further into the future, we show that this happens in a stable manner. We present an efficient method to enable DL-based models to forecast at multiple scales. For the original 1 hour ahead forecasts generated by Foresight, the underlying data was aggregated into B-minute bins, as discussed in Section 3. In Foresight Plus, we introduce an efficient aggregation and upscaling pipeline. This procedure is abstracted away from the forecasting models, which still execute the same prediction function \(f(\cdot )\) illustrated in Section 3.1; \(T'\) historical traffic signals are used to forecast T future signals.

Recall that for a set of sensors S, the entire historical traffic data is denoted as \(\varvec{X}\) \(\in \) \(\mathbb {R}^{H \times |S| \times P}\), where H is the total number of historical B-minute bins. To upscale the traffic data to a new scale k, we sum each group of k rows of \(\varvec{X}\). This results in a new historical time series \(\varvec{X_k} \in \mathbb {R}^{\frac{H}{k} \times |S| \times P}\). We then train bespoke forecasting models on each of the upscaled datasets (requiring only minimal adjustments to the models themselves). To process an inference query at scale k, data from the appropriate (\(k \times T'\)) B-minute bins are first collected. These can be efficiently upscaled as described above, to produce inference input data \(\varvec{X_k'} \in \mathbb {R}^{T' \times |S| \times P}\). \(\varvec{X_k'}\) is then sent to the appropriate inference endpoint as input; further details of the Foresight Plus inference workflow are given in Section 7. We evaluate the effectiveness of this approach in Section 9.4.
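The upscaling step, summing each group of k rows of \(\varvec{X}\), reduces to a reshape-and-sum. A minimal sketch (assuming, for simplicity, that H is divisible by k; a trailing partial bin would be dropped in practice):

```python
import numpy as np

def upscale(X, k):
    """Aggregate B-minute traffic data to scale k by summing each group
    of k consecutive rows, yielding X_k of shape (H // k, |S|, P)."""
    H, S, P = X.shape
    H_k = H // k
    # Group rows into (H_k, k, |S|, P) blocks, then sum within each block.
    return X[:H_k * k].reshape(H_k, k, S, P).sum(axis=1)

X = np.arange(24, dtype=float).reshape(6, 2, 2)  # H=6, |S|=2, P=2
X3 = upscale(X, 3)
assert X3.shape == (2, 2, 2)
assert X3[0, 0, 0] == X[0, 0, 0] + X[1, 0, 0] + X[2, 0, 0]
```

The same routine serves both training (upscaling the full history) and inference (upscaling the latest \(k \times T'\) bins into \(T'\) input signals).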

7 Foresight plus: cost-effective serverless inference for sporadic workloads

The original Foresight architecture, which was initially designed with short-term forecasting in mind, provides a pipelined MLOps solution for regularly scheduled predictions. This satisfies the operational requirements of Transport for the West Midlands (TfWM), and integrates with existing traffic reporting systems. However, spatio-temporal forecasting requests that are spread over multiple scales may be more ad-hoc in nature (for example, in response to unusual traffic/cultural events). In scenarios such as this, inference requests are often triggered manually (e.g., via mobile apps) and arrive in a sporadic fashion. While provisioned inference solutions are well-suited to handling frequent and regularly scheduled predictions, they may not be cost-effective for sporadic workloads due to low utilization. Further, several endpoint instances may be required to handle bursty traffic and/or multiple models (for different forecasting scales).

With this in mind, we extend Foresight’s MLOps suite in Foresight Plus by adding a fully serverless inference solution across multiple models/scales. Upon the receipt of an inference request at a given scale (as described in Section 6), a lightweight serverless instance (i.e., an AWS Lambda function invocation) performs the necessary data extraction, upscaling and pre-processing, before invoking a further serverless inference endpoint. We maintain a unique serverless inference endpoint for each forecasting scale. These incur no cost when not in use (as is also the case with the aforementioned serverless pre-processing instance), and can rapidly scale to accommodate parallel requests. A serverless inference solution can be significantly more cost-effective than a provisioned alternative for many workloads [44].

7.1 Serverless inference cost model

We now formalize the cost model for fully serverless ML inference. This procedure consists of a pre-processing phase, followed by the invocation of a serverless inference endpoint:

$$\begin{aligned} C_{Total} = C_{PreProc} + C_{Inf} \end{aligned}$$
(3)

Both stages run on lightweight AWS Lambda Function-as-a-Service (FaaS) instances.

We first consider the detailed cost model for the pre-processing phase. \(C_{PreProc}\) consists of the expenses incurred by running the FaaS instance, as well as those of the corresponding requests to object storage to fetch the necessary data. It is defined as follows:

$$\begin{aligned} C_{PreProc} = C_{\lambda (PreProc)} + C_{S3(PreProc)}\end{aligned}$$
(4)
$$\begin{aligned} C_{\lambda (PreProc)} = C_{\lambda (Inv)} + T_{PreProc}M_{PreProc}C_{\lambda (Run)}\end{aligned}$$
(5)
$$\begin{aligned} C_{S3(PreProc)} = LC_{S3(List)} + GC_{S3(Get)} \end{aligned}$$
(6)

where \(C_{\lambda (Inv)}\) is the cost of invoking a single FaaS instance, \(T_{PreProc}\) is the duration of pre-processing (i.e., the runtime of the FaaS invocation), \(M_{PreProc}\) is the memory assigned to the instance (in MB), and \(C_{\lambda (Run)}\) is the cost per MB-second of FaaS runtime. Note that increasing the amount of AWS Lambda memory leads to a larger vCPU allocation, introducing a cost-to-performance trade-off which we examine in Sections 9.4.2 and 9.4.3. L and G correspond to the number of required object storage LIST and GET operations, respectively, with \(C_{S3(List)}\) and \(C_{S3(Get)}\) representing their costs. In Foresight Plus, we leverage the metadata provided in object storage solutions to filter out redundant files, hence minimizing the number of LIST and GET requests.

Next, we define the cost model for inference \(C_{Inf}\) as follows:

$$\begin{aligned} C_{Inf} = T_{Inf}M_{Inf}C_{ServInf} + YC_{Byte(In)} + ZC_{Byte(Out)} \end{aligned}$$
(7)

As above, \(T_{Inf}\) and \(M_{Inf}\) reflect the serverless inference instance runtime and memory allocation, respectively. The running cost of the AWS Lambda worker used for SageMaker inference, \(C_{ServInf}\), is currently \(\sim 19\%\) more expensive per MB-second than a regular FaaS instance. The final two terms represent the costs of data transfer in/out of the serverless inference instance. As described in Section 3, the forecast output has dimensionality \(T \times |S|\), so is the same for all scales.
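Equations (3)-(7) can be evaluated directly. In the sketch below, all prices are placeholder values in the rough vicinity of published AWS rates, not authoritative figures; current provider pricing should be consulted before using such a model for budgeting.

```python
def serverless_inference_cost(
    t_preproc, m_preproc,        # T_PreProc (s), M_PreProc (MB)
    t_inf, m_inf,                # T_Inf (s), M_Inf (MB)
    n_list, n_get,               # L and G object storage operations
    bytes_in, bytes_out,         # Y and Z transfer volumes
    c_inv=2e-7,                  # placeholder cost per FaaS invocation
    c_run=1.6e-8,                # placeholder cost per MB-second of runtime
    c_serv_inf=None,             # serverless inference rate, ~19% above c_run
    c_list=5e-6, c_get=4e-7,     # placeholder S3 LIST / GET request costs
    c_byte_in=0.0, c_byte_out=0.0,
):
    """Total cost of one fully serverless forecast, per equations (3)-(7)."""
    if c_serv_inf is None:
        c_serv_inf = 1.19 * c_run
    c_lambda_preproc = c_inv + t_preproc * m_preproc * c_run      # eq. (5)
    c_s3_preproc = n_list * c_list + n_get * c_get                # eq. (6)
    c_preproc = c_lambda_preproc + c_s3_preproc                   # eq. (4)
    c_inf = (t_inf * m_inf * c_serv_inf
             + bytes_in * c_byte_in + bytes_out * c_byte_out)     # eq. (7)
    return c_preproc + c_inf                                      # eq. (3)

cost = serverless_inference_cost(t_preproc=2.0, m_preproc=1024,
                                 t_inf=1.5, m_inf=2048,
                                 n_list=2, n_get=10,
                                 bytes_in=1e5, bytes_out=1e4)
```

Because every term scales with actual usage, \(C_{Total}\) is incurred per request, which is precisely why the model favours sporadic workloads over an always-on endpoint.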

The standard alternative to a serverless inference solution is to permanently provision one or more inference endpoints (hosted on VM instances). Such a solution can incur high passive running costs over time, while achieving poor resource utilization under sporadic workloads. We evaluate the cost savings of our fully serverless inference solution in Section 9.4.

7.2 Enhanced prediction flexibility

The Foresight system was able to produce pre-defined outputs at regularly scheduled intervals. These outputs can be used to implement timely interventions when unusual network conditions are likely. Foresight Plus, on the other hand, provides a more ad-hoc predictive capability which, apart from offering multiple extended forecasting scales, can address different use-cases. If, for example, a road traffic accident occurs in the network, a network manager may wish to envisage the likely impact on traffic flows a few hours ahead. Mobile applications could trigger such inference requests off-site. Roadwork operators may wish to submit ‘what-if’ inference requests to determine optimal periods (in terms of reducing disruption) for such works to take place. Foresight Plus addresses these requirements, supplementing the regularly scheduled inference outputs of the Foresight system.

7.3 Serverless architecture

The ad-hoc nature of the typical Foresight Plus use-case is an ideal fit for a cloud-based serverless architecture. Serverless computing is a resource delivery model in which the cloud provider is responsible for the provision and management of the underlying infrastructure and services. The attractive properties of serverless computing include elasticity, high availability, and cost-effectiveness with granular billing. In particular, users are only charged while resources are being used. In recent years, serverless computing has been successfully applied to machine learning (ML) inference, in situations where ML models are suitably sized and have achievable latency/throughput requirements on such platforms. Whilst the Foresight system uses pre-provisioned AWS SageMaker endpoints (servers) which incur continuous costs, Foresight Plus benefits from the serverless inference approach using lightweight Function-as-a-Service (FaaS) compute instances (including via AWS SageMaker Serverless Inference). Although the serverless architecture brings certain challenges, including restricted compute/memory capacities and (in some situations) a short ‘cold-start’ delay when processing requests [49], it is well-suited to the Foresight Plus usage scenario.

8 Foresight Plus system architecture

In this section we present an overview of the Foresight Plus system architecture, as illustrated in Fig. 1. Foresight Plus was built in collaboration with our partners at TfWM, and extended their existing AWS platform. With this in mind, we chose to use AWS services throughout the architecture. It should be stressed that our methods could be readily implemented on any comparable cloud platform. We will first describe how streaming traffic data is ingested and aggregated, before presenting the MLOps suite and fully serverless inference procedure. Details of how DUE data and flow information are processed are given in Sections 4 and 5 respectively. See Sections 6 and 7 for discussion of our extended forecasting scales and serverless inference, respectively.

8.1 Streaming data ingestion, aggregation and storage

Figure 2 illustrates the data ingestion, aggregation and storage pipeline of Foresight Plus. The primary data source is anonymized ANPR vehicle capture information from the West Midlands road network managed by TfWM. This data enters the system via POST requests to an API endpoint, before being forwarded to a streaming ETL service (AWS Kinesis Data Firehose). Individual vehicle captures (each including a timestamp, a salted hash of the vehicle registration, the camera/lane of observation and the vehicle type) are buffered by this service, and are periodically flushed to object storage (once the buffer fills, or a short time period elapses). The buffered file is also converted to a columnar format (Apache Parquet) for improved query performance.

Fig. 1: High-level architecture of the Foresight Plus cloud-based forecasting system

We next use a serverless data integration offering (AWS Glue) to periodically crawl (i.e., scan) the object storage buckets and catalogue these intermediate files. This enables the use of AWS Athena to run serverless SQL queries over the columnar Parquet data. These queries generate aggregated traffic count data, giving the total number of vehicles of each type (e.g., petrol car, HGV) that have passed each roadside camera within the current time bin, i.e., the last B minutes. We use scheduling functionality in a cloud monitoring service (AWS CloudWatch) to trigger the SQL processing (via lightweight serverless functions) for the current time bin. This procedure writes a single file to object storage (AWS S3) per time bin, which can later be used as an input to ML workflows.
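The binning logic behind these aggregation queries can be sketched in Python (a simplified stand-in for the Athena SQL; the record layout and function names are illustrative, not the production schema):

```python
from collections import Counter
from datetime import datetime

def bin_start(ts, B=15):
    """Floor a timestamp to the start of its B-minute bin."""
    return ts.replace(minute=(ts.minute // B) * B, second=0, microsecond=0)

def aggregate_counts(captures, B=15):
    """captures: iterable of (timestamp, camera_id, vehicle_type) tuples,
    a stand-in for the buffered ANPR records. Returns a dict mapping
    (bin_start, camera_id, vehicle_type) -> vehicle count."""
    counts = Counter()
    for ts, camera, vtype in captures:
        counts[(bin_start(ts, B), camera, vtype)] += 1
    return dict(counts)
```

In the deployed pipeline this grouping is expressed as a SQL `GROUP BY` over the Parquet data rather than in application code.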

8.2 MLOps suite, training and inference

We leverage an AWS SageMaker MLOps pipeline to create and deploy forecasting models. Data scientists can run experiments (e.g., in SageMaker Studio Notebooks) over data held in object storage, using standard libraries such as NumPy, PyTorch and TensorFlow. Once a model has been developed, its source code can be pushed to one of two Git repositories (test, production) hosted in AWS CodeCommit.

Repository updates then trigger the MLOps pipeline to provision a compute instance and perform the necessary pre-processing and training of the model. The MLOps pipeline can be configured to re-train the model periodically, e.g., once per week, to continually incorporate the latest traffic data.

Fig. 2: High-level overview of Foresight Plus streaming data ingestion, aggregation and storage pipeline

The trained model is then deployed to a SageMaker inference endpoint. In Foresight, this used a VM instance; in Foresight Plus, we employ a fully serverless inference architecture. Both inference solutions are illustrated in Fig. 1. In Foresight Plus, bespoke trained models for each forecasting scale (see Section 6) are first produced in the MLOps suite. We then deploy a SageMaker Serverless Inference (AWS Lambda-backed) endpoint for each scale. This entails uploading the trained model artifacts and associated inference code to each Lambda container. Note that configuring these functions incurs no cost; serverless instances are billed only when invoked, and incur no passive costs over time. Our inference routine then proceeds in two stages. End users first submit HTTP inference requests to a pre-processing Lambda function, specifying the desired forecasting scale. This function fetches the required inference input data from object storage (e.g., the last 3 hours of traffic data for \(k = 3\)) and performs the necessary pre-processing. It then forwards the processed inference data to the relevant SageMaker Serverless Inference endpoint. Once inference has been completed, results are returned to the user.
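A minimal sketch of the pre-processing stage follows. Here `make_handler`, `fetch_window` and `invoke_endpoint` are hypothetical names introduced so the routing logic can be exercised outside AWS; they are not the actual Foresight Plus code:

```python
import json

# The seven forecasting scales offered by Foresight Plus (Section 6).
VALID_SCALES = (1, 3, 6, 12, 24, 36, 48)

def make_handler(fetch_window, invoke_endpoint):
    """Builds a Lambda-style handler. fetch_window(k) returns the last k hours
    of aggregated counts from object storage; invoke_endpoint(k, payload)
    calls the per-scale serverless inference endpoint. Both are injected
    dependencies (assumptions), keeping the handler testable locally."""
    def handler(event, context=None):
        k = int(json.loads(event["body"])["scale"])
        if k not in VALID_SCALES:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "unsupported scale"})}
        window = fetch_window(k)               # e.g., last 3 hours for k = 3
        forecast = invoke_endpoint(k, window)  # forward to per-scale endpoint
        return {"statusCode": 200, "body": json.dumps({"forecast": forecast})}
    return handler
```

With stub dependencies, `make_handler(lambda k: [k] * 4, lambda k, w: w)` yields a handler that validates the requested scale and relays the fetched window.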

9 Experimental analysis

9.1 Dataset and experimental setup

In this section we present the results of our experiments which test the effectiveness of popular traffic forecasting methods in a new setting. We then evaluate the impact of incorporating DUE data as an additional dimension to the input feature vector. We also consider the performance impact of using the FAAM, in place of a distance-based adjacency matrix, in a GNN forecasting model. After exploring the error profiles of our models, and their efficiency within Foresight, we evaluate the enhancements made in Foresight Plus. In particular, we assess forecasting performance at longer scales, as well as the cost-effectiveness and performance of our fully serverless solution (compared against non-serverless alternatives).

9.1.1 Road camera dataset

The anonymized and aggregated data used for the experiments is from a set of ANPR cameras in the West Midlands region of the UK, covering several large conurbations including Birmingham and Coventry. The precise locations of cameras remain private. Cameras are located along a variety of route types, including busy interconnections, inner-city streets and suburban/rural roads. This differs from many prior datasets, such as METR-LA [24], where road sensors are typically located on freeways with a high volume of free-flowing traffic. The quality of our data is high: the rate of missingness is only 2.3%, compared with 8.1% for METR-LA. We use linear interpolation to impute these missing values.
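The imputation step can be sketched as a generic linear interpolation over NaN gaps (an illustrative helper, not the exact production routine):

```python
import numpy as np

def impute_linear(series):
    """Linearly interpolate NaN gaps in a 1-D traffic count series."""
    s = np.asarray(series, dtype=float)
    idx = np.arange(s.size)
    missing = np.isnan(s)
    # np.interp fills each missing index from its nearest known neighbours.
    s[missing] = np.interp(idx[missing], idx[~missing], s[~missing])
    return s
```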

9.1.2 Experimental setup

Unless stated otherwise, the vehicle count data used in the following experiments was collected between August 5th and December 5th 2021 (inclusive), and was aggregated at 15-minute intervals. DUE data was collected for the same period. Vehicle-level flow was measured between August and November 2021 to compute the FAAM. Experiments on Foresight Plus are described in Section 9.4; the dataset used for those experiments was collected over a different date range.

The data was split into training, validation and test sets in a 70/10/20 ratio. We evaluate performance using mean absolute error (MAE) and mean absolute percentage error (MAPE). We also calculate the error distribution’s coefficient of variation, which we refer to as the error coefficient of variation (ECV). We refer to the set of absolute errors across all test samples as \(\mathcal {E}\), and hence \(ECV = \frac{\sigma (\mathcal {E})}{\mu (\mathcal {E})}\). The ECV allows us to compare the dispersion of the error terms across different distributions (i.e., the sets of errors made by different models), as it normalizes by the mean error. A high ECV indicates that predictions are inconsistent.
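A direct NumPy implementation of the ECV metric (population standard deviation assumed):

```python
import numpy as np

def ecv(abs_errors):
    """Error coefficient of variation: ECV = sigma(E) / mu(E), where E is the
    set of absolute errors across all test samples. np.std defaults to the
    population standard deviation (ddof=0)."""
    e = np.asarray(abs_errors, dtype=float)
    return float(e.std() / e.mean())
```

Identical errors give ECV 0; the more dispersed the errors relative to their mean, the higher the ECV.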

We evaluate the results firstly over all time periods in the test data, which we refer to as ‘Any Time’ (AT) experiments. We also perform evaluation focusing only on ‘Peak Times’ (PT). We identify peak times as those that have historically shown high average traffic counts, but also high levels of variability. High average traffic counts indicate heavy load on the network, which we assume are periods of interest for transport managers. High levels of variability are a sign of challenging forecasting conditions, and may denote periods of unusual traffic conditions on the network. We identify these periods of interest by first dividing the dataset into weekends and weekdays, and then further splitting each of these into hourly subsets. The mean and coefficient of variation of each subset are then calculated. Any subset with both mean and coefficient of variation in the upper two quartiles is classified as peak time. The only time periods which satisfy this are 7am-8am and 8am-9am on weekdays, hence we select these as our peak times. This selection also conforms closely to the domain knowledge of our partners at TfWM.
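The selection procedure above can be sketched as follows (the dictionary layout is illustrative: subsets are keyed by a weekend/weekday flag and hour, and "upper two quartiles" is taken as at or above the median):

```python
import numpy as np

def peak_subsets(counts):
    """counts: dict mapping (day_type, hour) -> list of historical counts.
    Returns the keys of subsets whose mean AND coefficient of variation both
    lie in the upper two quartiles (>= median) across all subsets."""
    keys = list(counts)
    means = np.array([np.mean(counts[k]) for k in keys])
    cvs = np.array([np.std(counts[k]) / np.mean(counts[k]) for k in keys])
    m_med, c_med = np.median(means), np.median(cvs)
    return [k for k, m, c in zip(keys, means, cvs) if m >= m_med and c >= c_med]
```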

9.2 Forecasting models

As discussed in Section 2, numerous methods have been applied to the traffic prediction problem over many years. Our objective is not to identify the current state of the art in this domain, but rather to select a leading, popular GNN-based solution, and to adapt and enhance its performance using our methods and the real-world data available via our partnership with TfWM. DCRNN [34] is such a solution and is widely referenced in surveys of leading traffic prediction models [25, 29, 55]. As comparative baselines, we select a small number from the plethora of alternative solutions, with representatives from some of the common categories of prediction models. One of our selection criteria was to find solutions which had been implemented on the METR-LA [24]/PEMS-BAY datasets, as these are comparable to the traffic count data available to us for this study.

The following forecasting models have been evaluated on the road camera dataset.

  • Historical Average (HA): We produce a historical average matrix based on the training set. The average reading over the training set is calculated at each sensor in S for each of the 672 (4x24x7) weekly time steps. To perform inference, we give the historical average value of the target time period as our prediction (the notion of \(T'\) historical traffic signals is not applicable to this method).

  • ARIMA: We iterate over all sensors and all test examples. In each iteration, we train an ARIMA model, using the previous \(T' = 100\) values as the training input.

  • Feed Forward Neural Network (FFNN): We implement an FFNN, where the input consists of the previous \(T'\) readings across all sensors \(s \in S\). The model produces predictions for the next T forecasting horizons. The network is constructed with two hidden linear layers, with ReLU activation functions. Model parameters are learned using backpropagation, with an L1 loss function.

  • Long Short Term Memory (LSTM): This is implemented similarly to FFNN, except using LSTM layers in place of linear layers. Within the LSTM layers, input data is treated as a sequence and temporal patterns are learnt using an additional hidden layer to capture the cell state, which passes information along the sequence.

  • Diffusion Convolutional Recurrent Neural Network (DCRNN-Base): We select DCRNN [34] as an illustrative example of an effective GNN method. This method has been previously identified as one of the best-performing (GNN) approaches for traffic forecasting on benchmark datasets [9]. The model utilizes a distance-based adjacency matrix to model the spatial relationships between road sensors, and employs diffusion convolution and bidirectional random walks to simulate traffic propagation in the network. We utilize the PyTorch implementation of DCRNN [34]. Note that we term this DCRNN-Base to avoid confusion with the variants below.

  • DCRNN-RW-T / DCRNN-RW-G: DCRNN with DUE adaptation to include roadwork data. DCRNN-RW-T associates live roadworks with all sensors within a 1000m distance threshold. DCRNN-RW-G uses thresholded Gaussian kernel normalization (threshold \(k = 0.1\)).

  • DCRNN-F: DCRNN with the FAAM representing the underlying graph structure. An acceptable transition period \(\tau \) between sensors is given as 3600 seconds and thresholded Gaussian kernel normalization (\(k = 0.1\)) is applied on the matrix.

  • DCRNN-RW-F: DCRNN with roadworks (using Gaussian kernel method) and FAAM.
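The thresholded Gaussian kernel normalization used by the -G and FAAM variants can be sketched as follows (following the construction popularized by DCRNN; taking sigma as the standard deviation of the distances is an assumption of this sketch):

```python
import numpy as np

def thresholded_gaussian_adjacency(dist, threshold=0.1):
    """Build a weighted adjacency matrix from pairwise distances (or flow
    transition times): W_ij = exp(-(d_ij / sigma)^2), with entries below
    `threshold` pruned to 0. sigma is the std of the distance entries."""
    dist = np.asarray(dist, dtype=float)
    sigma = dist.std()
    w = np.exp(-np.square(dist / sigma))
    w[w < threshold] = 0.0  # sparsify: drop weak inter-sensor relationships
    return w
```

The same kernel can be applied to a FAAM-style matrix by substituting flow-derived transition times for physical distances.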

All models are implemented in AWS SageMaker Studio using Python 3.6, on a ml.g4dn.xlarge instance. We use PyTorch 1.8 to implement FFNN, LSTM, and DCRNN (including all variants). Unless stated otherwise, \(T' = T = 4\), and \(B = 15\) minutes. In practice, we make predictions over horizons of 15, 30, 45 and 60 minutes (henceforth referred to as 15m, 30m, 45m, 60m). Note that we make predictions over all horizons simultaneously, i.e., the model does not gain information about the observed time series at \(t+1\) when predicting for \(t+2\). The more distant forecasting horizons (i.e., 45m, 60m) offer transport managers more time to implement pre-emptive interventions on the road network. Hence, performance gains here are particularly valuable.
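The windowing implied by \(T' = T = 4\) can be sketched as follows (an illustrative helper; all four target horizons are emitted together, so no horizon observes later data):

```python
import numpy as np

def make_windows(series, t_in=4, t_out=4):
    """Build (input, target) pairs from a binned count series: each sample
    uses the last t_in bins as input and the next t_out bins as the target,
    with all t_out horizons predicted simultaneously."""
    X, Y = [], []
    for i in range(len(series) - t_in - t_out + 1):
        X.append(series[i:i + t_in])
        Y.append(series[i + t_in:i + t_in + t_out])
    return np.array(X), np.array(Y)
```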

Table 1 Table of results for all time periods (AT)
Table 2 Table of results for peak time (PT) periods only

9.3 Experimental results: Foresight

We describe the key findings from our experimental results, which are presented in Tables 1 (AT) and 2 (PT). First, we compare the performance of several existing forecasting approaches in our new data setting. We then consider the impact of incorporating roadworks as an exogenous input feature, as well as using flow to determine edge weights in the adjacency matrix. Next, we discuss our findings pertaining to prediction reliability using ECV. Finally, we analyze the efficiency of Foresight.

9.3.1 Analysis of existing approaches

We first consider the performance of several existing spatio-temporal forecasting approaches in our new data setting. During the AT experiments it can be observed that DCRNN-Base makes more accurate predictions across all four time horizons (MAE improvements - 15m: 20.9%, 30m: 20.6%, 45m: 25.3%, 60m: 22.7%) compared to the next closest non-DCRNN model (LSTM), with the largest improvements seen at the longest forecasting horizons. Similarly, during the PT experiments DCRNN-Base remains the most accurate model compared to the non-DCRNN options. However, it is interesting to note that the improvements compared to LSTM are now much smaller (MAE improvements - 15m: 18.5%, 30m: 15.1%, 45m: 14.2%, 60m: 8.3%), and the trend at longer horizons is reversed, with the smallest MAE improvements seen there. ARIMA tends to be a competitive model for shorter horizons, during both PT and AT experiments; however, its performance deteriorates quickly at longer forecasting horizons, which indicates that this model requires fresh data to support accurate predictions. HA and FFNN make the least accurate predictions across all forecasting horizons.

Different trends emerge when MAPE performance is considered. DCRNN-Base now exhibits poorer performance than LSTM across all forecasting horizons during the AT experiments (MAPE degradation - 15m: 22%, 30m: 25.8%, 45m: 21.3%, 60m: 11.3%); these discrepancies are further exacerbated in PT experiments (MAPE degradation - 15m: 37.7%, 30m: 36.6%, 45m: 41.2%, 60m: 18.3%). This is an interesting result as it suggests that while LSTM makes poorer predictions on average (i.e., MAE), it makes fewer errors of significant magnitude, leading to a lower MAPE (this metric is highly sensitive to outliers in the error term). It may therefore be inferred that LSTM is better than DCRNN-Base at predicting unusual traffic patterns, especially at peak times. In terms of MAPE, ARIMA was shown to be a highly competitive model across all forecasting horizons, outperforming DCRNN-Base in most cases, with more pronounced gains in PT experiments. As ARIMA is retrained on the most recent data when evaluating each test sample, it will naturally be more responsive to unusual traffic patterns than models trained using a conventional train/test split. LSTM still largely outperforms ARIMA with regard to MAPE. HA performs particularly poorly on this metric, due to its inability to dynamically respond to current network conditions.
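A toy example illustrates how a model can achieve a lower MAE yet a much higher MAPE; the counts and predictions below are fabricated purely for illustration:

```python
import numpy as np

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y, float) - np.asarray(yhat, float))))

def mape(y, yhat):
    """Mean absolute percentage error (%): sensitive to large relative errors."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

y = [100.0, 100.0, 10.0, 10.0]     # two busy bins, two quiet bins
model_a = [95.0, 95.0, 9.5, 9.5]   # errors spread proportionally
model_b = [100.0, 100.0, 5.0, 5.0] # errors concentrated on quiet bins
```

Model B has the lower MAE (2.5 vs 2.75), but its MAPE is five times higher (25% vs 5%), mirroring the LSTM vs DCRNN-Base contrast above.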

9.3.2 DUE analysis

We also evaluate the impact of adding dynamic urban events to GNN models. Mixed results are achieved when MAE is considered. DCRNN-RW-G, which associates roadworks using a thresholded Gaussian kernel, generally yields higher MAE than DCRNN-Base across both AT and PT experiments. These discrepancies in MAE are particularly pronounced for long forecasting horizons during peak times (MAE degradation - 45m: 5.4%, 60m: 11.4%). On the other hand, DCRNN-RW-T (binary thresholding) achieves lower MAE compared to DCRNN-Base over all AT experiments. However, it still yields inferior performance at more distant forecasting horizons at peak times (MAE degradation - 45m: 1.4%, 60m: 2.1%).

The results for MAPE present a contrasting picture, where DCRNN-RW-G outperforms DCRNN-RW-T across all experiments. During AT experiments (especially at longer horizons), DCRNN-RW-G achieves significant improvements compared to DCRNN-Base (MAPE improvements - 45m: 29.8%, 60m: 29.1%). We observe a similar pattern during peak times. These findings indicate that using a thresholded Gaussian kernel when associating roadworks with sensors yields a reduction in large outlier errors (likely resulting in improved performance under unusual road network conditions).

9.3.3 FAAM analysis

Our experimental results indicate that using vehicle-level flow data to model inter-sensor relationships is an effective strategy. For AT experiments, DCRNN-F achieves lower MAE than DCRNN-Base across all time horizons, and is particularly effective at long forecasting horizons (MAE improvement - 60m: 3.2%). Further, we observe even larger MAE gains for DCRNN-F compared to DCRNN-Base at peak times (MAE improvement - 60m: 8.8%). These findings support the inclusion of vehicle-level flow data into GNN models for improved predictive performance.

We note that leveraging the FAAM in place of a distance-based adjacency matrix (i.e., DCRNN-Base) yields MAPE improvements in all cases. However, the most significant MAPE gains are still experienced by DCRNN-RW-G, indicating that incorporating roadworks is a more effective strategy for minimizing outlier errors. It should be noted that DCRNN-F mitigates much of the degradation in MAPE performance at peak times that DCRNN-Base suffers in comparison to LSTM, while also offering leading MAE results.

9.3.4 Error coefficient of variation

As illustrated in Tables 1 and 2, DL models, particularly those which have been enhanced by DUE data or the FAAM, experience the highest ECV (especially at longer forecasting horizons). As shown above, it is at these more distant horizons that the biggest performance improvements (MAE/MAPE) are observed for our augmented models. This suggests that while these solutions produce the best forecasts on average, their errors are the least consistent. This finding is noteworthy, and we would recommend further investigation to better understand its implications.

9.3.5 Efficiency analysis

As discussed in Section 3.2, the real-time forecasting task requires that \(T_{Total} = T_{Agg} + T_{PreProc} + T_{Inf} \le B\). In the current version of Foresight, we allow for \(T_{Agg} \le 40\) seconds. For all of the implemented forecasting models, \(T_{PreProc} \le 6\) seconds. Each model except ARIMA achieves \(T_{Inf} \le 2\) seconds. As discussed above, at inference time we train an ARIMA model over the previous \(T' = 100\) values for each \(s \in S\). For ARIMA, \(T_{Inf} \le 16\) seconds. Hence, all of the presented models achieve \(T_{Total} \approx 1\) minute, satisfying (2) with significant headroom for \(B = 15\) minutes. Further, these results conform to alternative notions of real-time forecasting [42], where predictions were produced in a single-digit order of minutes.
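The constraint can be checked directly (times in seconds; the figures used below are the worst-case values quoted above):

```python
def within_budget(t_agg, t_preproc, t_inf, B_minutes=15):
    """Real-time forecasting constraint from Section 3.2:
    T_Agg + T_PreProc + T_Inf <= B (all times in seconds)."""
    return (t_agg + t_preproc + t_inf) <= B_minutes * 60
```

The worst case (ARIMA) totals 40 + 6 + 16 = 62 seconds, roughly one minute against a 900-second budget.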

9.4 Experimental results: Foresight Plus

9.4.1 Extending the forecasting scale

We first evaluate the resilience of DL-based forecasting models to the extension of the forecasting scale further into the future, as well as the impact of a larger historical traffic time series. As discussed in Section 6, Foresight Plus offers 7 different forecasting scales; namely \(k = 1, 3, 6, 12, 24, 36, 48\) hours ahead. Each still forecasts 4 horizons into the future, with the time period (B minutes) covered by each horizon expanding (uniformly) with the scale. For instance, the ‘1hr ahead’ scale includes forecasting horizons 15mins, 30mins, 45mins and 60mins into the future, while the horizons for ‘12hrs ahead’ reflect 3hrs, 6hrs, 9hrs and 12hrs ahead. The forecasting scales, together with their corresponding horizons, are presented in Table 3.
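The uniform expansion of horizons with the scale \(k\) can be expressed with a small helper (illustrative, assuming the 15-minute base bin):

```python
def horizon_minutes(k, n_horizons=4, bin_minutes=15):
    """Horizon end-times (minutes ahead) for the 'k hours ahead' scale:
    the k-hour window is divided uniformly into n_horizons steps,
    so each horizon spans k * bin_minutes minutes."""
    step = k * bin_minutes
    return [h * step for h in range(1, n_horizons + 1)]
```

For \(k = 1\) this yields 15, 30, 45 and 60 minutes; for \(k = 12\), 3, 6, 9 and 12 hours, matching Table 3.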

Table 3 Table illustrating the time periods encapsulated by each horizon, for all seven forecasting scales (\(k = 1, 3, 6, 12, 24, 36, 48\)).
Fig. 3: Mean Absolute Error (MAE) results for all forecasting scales (\(k = 1, 3, 6, 12, 24, 36, 48\)). Each scale has four unique horizons (see Table 3)

We plot MAE and MAPE results in Figs. 3 and 4, respectively. Our experimental findings for the ‘1hr ahead’ scale indicate that DL/GNN-based forecasting models such as DCRNN benefit significantly from a larger historical dataset of traffic signals. Expanding from 4 to 18 months of ANPR readings leads to a reduction in MAPE from \(\sim 27\)-\(40\%\) to \(\sim 17\)-\(24\%\) (over the 4 horizons for \(k=1)\). Beyond the obvious benefit of learning from a greater number of training samples, the fact that the model has observed data from each month in the calendar year allows it to better capture the seasonality of traffic patterns.

Next, considering MAE over all scales, we see that errors increase in an approximately linear fashion as the forecasting scale grows. This is expected, since as k increases, so does the overall volume of traffic being predicted (note that we forecast traffic count, rather than speed), as well as the likelihood of an unusual traffic event influencing predictions. Forecasting errors increase over the 4 horizons in a steady fashion over all scales, apart from a slightly increased error for horizon 2 at \(k=36\).

Considering MAPE, we see that prediction errors remain stable as the forecasting scale expands significantly further into the future, and can even improve. We observe minimal degradation between \(k = 1, 3, 6, 12\), outside of minor lapses in performance at \(k = 12\) (horizon 1). Forecasting errors then significantly reduce for \(k=24, 36, 48\). While the particularly low errors at these longer timescales may indicate the relative stability of daily traffic patterns, and/or the reduced impact of isolated unusual traffic events such as roadworks, it is nonetheless impressive to see such strong predictive accuracy when predicting much further into the future (using only a lightweight upscaling method). These results indicate that Foresight Plus is a reliable solution across all forecasting scales, highlighting its utility for a variety of use cases. It is interesting to observe the contrast between the errors at \(k=24, 48\) and \(k=36\). The slight increase in MAPE at \(k=36\) may indicate that forecasting traffic at multiples of 24 hours is a special case which offers additional stability. Further work could investigate this effect in more detail.

Fig. 4: Mean Absolute Percentage Error (MAPE) results for all forecasting scales (\(k = 1, 3, 6, 12, 24, 36, 48\)). Each scale has four unique horizons (see Table 3)

9.4.2 Fully serverless real-time inference: cost

We now evaluate Foresight Plus’ fully serverless solution for real-time forecasting inference. In this section, we consider three variants of Foresight Plus, each with a different memory allocation. Namely, we configure serverless inference pipelines with 1GB, 3GB and 6GB memory (note that this applies to both \(M_{PreProc}\) and \(M_{Inf}\)). These configurations were selected as 1GB/6GB are the current min/max memory allocation for AWS SageMaker Serverless Inference. Note that increasing the memory allocation to AWS Lambda instances (which both the pre-processing and inference stages are executed on) entails an increase in vCPU allocation and network bandwidth.

Our cost model only considers inference expenditure, as our work focuses on sporadic inference workloads across multiple scales, rather than minimizing the number/duration of training jobs. As an illustrative example of training overheads, our models incurred costs of \(\sim \) $0.30 (\(k = 48\)) to \(\sim \) $10 (\(k = 1\)) on an AWS ml.g4dn.xlarge instance. These one-off costs are amortized relative to daily inference costs, particularly at large query volumes. While the models will periodically need to be re-trained, the frequency/duration of these jobs is beyond the scope of our work, and has been explored previously [46]. Further, the large date range covered by the ANPR dataset (18 months, vs 4 months for METR-LA) ensures that seasonality is captured by the model; this should reduce the frequency at which the models require re-training.

We first compare the cost-effectiveness of Foresight Plus to that of a standard inference solution; that is, a provisioned endpoint, running on VM instances (analogous to the original Foresight solution). In this experiment, we model the non-serverless solution as running with two AWS t2.medium instances (2 vCPU, 4GB memory). We allocate two instances to handle overlapping queries (which could exceed the memory of one instance), and to offer redundancy. To simulate the sporadic inference workloads which Foresight Plus targets, we model queries being received over a 24-hour period, which are uniformly distributed over the 7 forecasting scales (\(k = 1, 3, 6, 12, 24, 36, 48\)).

Fig. 5: Daily query cost against daily query volume, for each serverless inference solution as well as a provisioned endpoint alternative. Assumes a uniform distribution over forecasting scales and cold/warm requests

Fig. 6: Per-request runtime and cost for each serverless inference implementation, across all forecasting scales. Left: cold inference requests. Right: warm inference requests

We also consider both cold and warm inference requests. Each AWS Lambda request runs in a dedicated container. Upon the receipt of a request, the AWS Lambda service attempts to allocate it to a container. If no containers are immediately available, one must be started; this incurs a short delay, and is known as a ‘cold start’. Once a request has been completed, its container remains warm for \(\sim 15\) minutes. If another request is received during this period, it can be rapidly assigned to the warm container (this effect is termed ‘container re-use’), hence avoiding the cold start delay. It should be noted that container re-use only applies to instances of the same Lambda function, so consecutive requests relating to different forecasting scales will not benefit from this effect. However, all inference requests initially invoke a shared pre-processing function, which will benefit from container re-use for all queries. In this experiment, we assume a 50/50 split of cold/warm inference requests.

On the other hand, the provisioned endpoint solution will have a fixed cost regardless of utilisation, as VM instances are simply billed at an hourly rate. In Fig. 5, we observe that each serverless variant is significantly more cost-effective than the provisioned alternative, until the daily query volume becomes very high (1GB: 4120 queries/day, 3GB: 2424 queries/day, 6GB: 1270 queries/day). This highlights the significant cost savings which a lightweight serverless inference solution can provide.
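The break-even analysis reduces to a simple calculation; the prices used in the test are placeholders, not AWS's actual rates:

```python
def per_query_cost(cold_cost, warm_cost, cold_frac=0.5):
    """Expected serverless cost per query under an assumed cold/warm mix."""
    return cold_frac * cold_cost + (1 - cold_frac) * warm_cost

def break_even_volume(vm_daily_cost, cold_cost, warm_cost, cold_frac=0.5):
    """Daily query volume at which per-request serverless spend matches the
    flat daily cost of a provisioned (always-on) endpoint pair."""
    return vm_daily_cost / per_query_cost(cold_cost, warm_cost, cold_frac)
```

Below the break-even volume, the serverless configuration is strictly cheaper; above it, the provisioned endpoint wins.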

9.4.3 Fully serverless real-time inference: performance

We next consider the cost-to-performance ratio of each serverless solution. As illustrated in Fig. 6, while 1GB is clearly the cheapest solution, it is also the least performant. We identify the 3GB configuration as a strong compromise of cost and performance, for both cold and warm inference requests. Its end-to-end inference runtimes are significantly faster than 1GB, and are on par with 6GB for all tests. It also costs significantly less than three times as much as the 1GB configuration; this is because Lambda functions are billed per second of active runtime, so improving computational efficiency also reduces costs. While the 6GB solution is usually the most performant, it typically only achieves small improvements over 3GB while incurring significantly higher costs.

However, it should be noted that all three solutions comfortably satisfy the real-time inference requirements described in Sections 3.2 and 9.3.5. Therefore, even the 1GB solution is a viable option for deployment, particularly if the inference workload will predominantly focus on shorter forecasting scales (1-3hrs ahead). That said, if responsiveness is especially important (e.g., if queries are performed by members of the public via a mobile app), and/or many queries will target the longer-to-execute scales, then 3GB is likely the more suitable configuration.

10 Conclusion

In this work, we develop Foresight Plus (in collaboration with Transport for the West Midlands), which builds upon the existing Foresight system to enhance its real-time spatio-temporal forecasting capabilities. We present a novel method for extending the forecasting scale, enabling predictions to be made 1, 3, 6, 12, 24, 36 or 48 hours ahead. This is a significant improvement when compared to the 1 hour ahead scale which is commonly seen in forecasting literature (and was studied in Foresight). It greatly improves the utility of the system for users who require accurate forecasts multiple hours in advance. Our experimental analysis shows that GNN forecasting models such as DCRNN can achieve impressive performance under extended forecasting horizons, with MAPE frequently reducing as the forecasting scale increases. These results highlight the effectiveness of Foresight Plus as both a short and longer-term forecasting system. Further, we consider the fact that many inference workloads will be sporadic in nature, with queries arriving at irregular intervals and being spread over multiple different predictive models/scales. We identify that a fully serverless, pay-as-you-use inference solution is far more cost-effective for such workloads than a provisioned approach (as was utilized in Foresight). We develop an optimized serverless inference procedure in Foresight Plus, and evaluate its scalability, cost and performance across multiple resource configurations. Further work could evaluate the robustness of additional forecasting model types to extensions in the forecasting scale. Also, additional optimisations could be made to accelerate queries made in quick succession (e.g., caching of previous results for a short period of time). The system is generalizable to any similar setting. The dataset used in this study, although relatively complete when compared to benchmark traffic datasets, is not particularly specialized or feature-rich. 
Any transport authority with a sensor network incorporating ANPR capabilities (which enable the capture of vehicle-level flow information), and with access to sources of DUE data, could readily adopt this cloud-based system. Our incorporation of roadworks data into the predictive model, using simple distance metrics from the sensor network, is a demonstration of how DUE data can improve predictive accuracy. Indeed, there is much scope for leveraging other types of exogenous datasets into such solutions.