1 Introduction

Local traveling speed measurements and predictions provide the basis for many vehicle routing and traveling time prediction algorithms. The latter can be found in navigation devices used in private vehicles as well as in logistics applications. Predictions are especially relevant for congested networks [16]. While navigation devices increasingly use real-time information, logistics applications typically used in the planning stage (pre-trip) rely on long-term (in the sense of several hours or days ahead) route traveling time predictions. Such predictions may be derived from link traveling time predictions based on local (to the links in the network) traveling speed predictions.

Traveling speed predictions have been derived from either local measurements, taken from stationary road side sensors (for a survey see e.g. [1]), or from floating sensors distributed to a fleet of vehicles (from a long list of contributions see e.g. [3, 4, 10]). There exists an extensive literature dealing with traveling time estimation; see the collection of survey papers in [2] or the literature reviews in [13] and [23] amongst others. All these approaches presuppose large databases in order to yield accurate predictions for all possible routes in a given city and for a given time of the day (see e.g. [11] and [18]).

Two main hurdles for providing such a large sample exist. First, only a subset of the road links can be equipped with road side sensors due to cost restrictions. Second, the data collecting fleet vehicles do not traverse all road links at all times. Especially those links located on the outskirts of a city are typically neglected. Consequently, many cities currently lack sufficiently large data sets required by the mentioned approaches. In order to fill the gaps in the data, one can implement an extensive measurement campaign to cover missing road links. An alternative option—a backup strategy—is to estimate free flow speeds based on static data. Moses and Mtoi [15] propose different models for estimating free flow speeds based on the speed limit, spacing between signalized intersections and vehicle type. Transportation Research Board [20] and [5] consider additional adjustment factors for structural road parameters and vehicle types. Tseng et al. [21] extend these methods to suburban highways. Graser et al. [8] and [9] suggest the usage of different road network centrality measures as predictors for link traveling speed. However, these authors acknowledge that some of the measures depend on the boundaries of the chosen map which introduces some arbitrariness to the approach.

Most of these basic approaches do not address the presence of daily variation of traveling speeds in sufficient detail, or even neglect it completely. Leodolter et al. [12] highlight the dramatic losses in prediction accuracy resulting from neglecting the daily variation.

In general, one should expect decreased traveling speeds in morning and evening rush hours. However, more precise information on the daily variation of local traveling speeds must be inferred from data. Nonetheless, it may be postulated that daily variation patterns are very similar for road links of similar usage (in a given city). Herein, road usage is operationalized as functional road categories. Moreover, these patterns may be similar for similar cities, which suggests the possibility of some form of transfer across cities. Leodolter et al. [12] illustrate this idea by transferring several models fitted on data from Vienna, Austria, to the nearby city Linz, Austria. In particular, this amounts to using the daily variation of one city as a proxy for the daily variation in a different city and is shown to improve the predictive power in comparison to approaches not taking daily variation into account. Hence, this method couples the simplicity and low costs of the mentioned backup approaches with the accuracy gained from using local measurements.

In this paper, we refine and extend the method presented in [12] in several respects. Most notably, we provide a more careful modeling of the daily variation patterns by allowing these profiles to differ across road types. Earlier approaches neglect dependence between different time-of-the-day intervals and hence result in noisy daily profiles. This problem becomes more severe when increasing the model complexity by allowing daily profiles to change with road type (as we do). Here, we propose to use penalized least-squares methods, which are already popular in non-parametric (spline-based) estimation. Penalizing the roughness of the daily variation patterns lowers the estimation uncertainty and yields smooth estimated daily profiles. We include details on an efficient numerical implementation of our approach in the Appendix. In addition, we observe that the similarities of the daily profiles across cities concentrate in low dimensional subspaces. We enhance the cross-city prediction approach of [12] to exploit this finding. Using these new developments, we comprehensively evaluate the effectiveness of a wide range of predictive models with respect to on-site performance (predictive power for the city for which data is available) and cross-city transfer. For the on-site comparisons we use data from the two Austrian cities Vienna and Linz as well as the French city Lyon. The cross-city performance is evaluated for the transfer from Vienna to Linz, Austria, and from Vienna to Lyon, France. The choice of cities here is due to data availability; all methods can be applied without change to other cities/countries as long as adequate data is available.

The remainder of this paper is organized as follows: Section 2 provides background information on the data set used for assessing the predictive power in subsequent sections. Subsequently, Section 3 presents the models and estimation techniques as well as the formulas used for prediction. Section 4 details the evaluation methodology. Section 5 discusses the evaluation results. Section 6 concludes the paper. The Appendix provides some details for practical implementation of our methods.

2 Data & descriptive analysis

The floating car data (FCD) used in our study is similar to that in [12]. More specifically, the FCD is collected by about 3500 registered taxis in the region of Vienna (Austria), about 300 taxis in the region of Linz (Austria), and about 400 taxis in the region of Lyon (France).

The FCD raw data consists of anonymized vehicle trajectories (time and GPS position) with a variable sampling interval (between 10 and 60 seconds). In the FCD processing, the vehicle trajectories are first map-matched to an OpenStreetMap road network graph of the region. During map-matching, the most probable road link for every GPS position is identified (by taking into account the great-circle distance between GPS position and road geometry as well as the estimated heading of the vehicle). Positions without plausible map-matchings are discarded from the trajectory. Then, the covered road distance between two consecutive map-matched positions is determined by a shortest-path routing on the road network. The quotient of the covered distance and the time elapsed between two GPS measurements provides one speed observation. The latter is assigned to the respective links by linear interpolation. Finally, trajectories corresponding to implausible speed observations of more than 110% of the link speed limit are discarded.
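The last two processing steps can be sketched as follows. This is a minimal illustration rather than the processing pipeline used for the study: the function names, the metre and second input units, and the reading of the plausibility filter as a per-observation check are assumptions made here.

```python
def speed_kmh(distance_m: float, elapsed_s: float) -> float:
    """Speed observation: covered road distance divided by elapsed time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return distance_m / elapsed_s * 3.6  # convert m/s to km/h


def is_plausible(speed: float, speed_limit_kmh: float, factor: float = 1.1) -> bool:
    """Plausibility rule from the text: at most 110% of the link speed limit."""
    return speed <= factor * speed_limit_kmh
```

For instance, 250 m covered in 30 s corresponds to 30 km/h, which passes the check on a link with a 30 km/h limit.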

Each of the three city-specific data sets covers a period of one year with about 3 million speed measurements per day in Vienna, about 700,000 measurements per day in Linz, and about 800,000 measurements per day in Lyon. Data from irregular days (public holidays, weekends, school holidays, etc.) is discarded as well as observations concerning links with a speed limit below 20 km/h. In particular, our analysis focuses exclusively on weekdays during the school period. Arguably, this time features the most severe congestion delays and is therefore most critical for traveling speed prediction.

Time is expressed in the form of 96 intervals of 15 minutes covering the 24 hours of a day. Speed measurements pertaining to a given road link, day, and time interval are averaged (using harmonic averages) to form a single observation. This leads to roughly 2.5 million observations for Viennese road links, roughly 1.1 million observations for road links in Linz, and roughly 1.25 million observations for road links in Lyon. The associated speed limits and road classification information are taken from OpenStreetMap. Finally, every road link is assigned a functional road classification (frc) number on the basis of its OpenStreetMap highway tag. These numbers indicate the road type and range from one (motorways) to eight (living streets). As a rough guide, the importance of road links decreases as the frc number increases.
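The aggregation step can be illustrated by a short sketch. The tuple layout of the input, the function names, and the convention that interval 1 starts at midnight are assumptions for illustration only; speeds are assumed to be strictly positive.

```python
from collections import defaultdict
from statistics import harmonic_mean


def interval_index(hour: int, minute: int) -> int:
    """Map a time of day to one of the 96 quarter-hour intervals (1..96)."""
    return (hour * 60 + minute) // 15 + 1


def aggregate(measurements):
    """Harmonic average of speeds per (link, day, time interval).

    measurements: iterable of (link_id, day, hour, minute, speed_kmh).
    """
    groups = defaultdict(list)
    for link_id, day, hour, minute, speed in measurements:
        groups[(link_id, day, interval_index(hour, minute))].append(speed)
    return {key: harmonic_mean(speeds) for key, speeds in groups.items()}
```

The harmonic average weights slow traversals more strongly than the arithmetic average, which matches averaging travel times over a fixed distance.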

For preliminary descriptive analysis, the regression model

$$\text{\texttt{sp}}_{i} = c_{f(i)} +\beta_{f(i)}\text{\texttt{mxsp}}_{i}+v_{i} $$

is fitted by least-squares to the first half of each city-specific data set. Herein, \(\text{\texttt{sp}}_{i}\), \(f(i)\), and \(\text{\texttt{mxsp}}_{i}\) denote the traveling speed (average), road type (frc), and speed limit of the i-th observation, respectively. In particular, intercept and slope coefficients are allowed to change across road types (frc). Next, prediction errors \(\hat v_{i}\) are calculated for each city based on the second half of the respective data set. Figure 1 shows the averages of the prediction errors over each of the 96 time intervals and for frc numbers 1, 4, and 7. The shown daily variation patterns reflect the frc classifications: highways (frc = 1) exhibit strong signs of congestion during morning and evening peaks, which fade during midday; medium size roads (frc = 4) show a clear night/day divide, but of lesser extent than highways; finally, living streets (frc = 7) show only little daily variation.
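A minimal sketch of this descriptive exercise is given below, assuming NumPy arrays `sp`, `mxsp`, `frc` (values 1 to 8), and `t` (values 1 to 96) of equal length. The function names and the array layout are hypothetical, and every road type in the evaluation half is assumed to occur in the fitting half with at least two distinct speed limits.

```python
import numpy as np


def fit_per_frc(sp, mxsp, frc):
    """Least-squares intercept and slope of sp on mxsp, separately per frc."""
    coef = {}
    for f in np.unique(frc):
        mask = frc == f
        X = np.column_stack([np.ones(mask.sum()), mxsp[mask]])
        coef[int(f)] = np.linalg.lstsq(X, sp[mask], rcond=None)[0]
    return coef


def mean_errors(sp, mxsp, frc, t, coef):
    """Average prediction error per (time interval, frc); NaN where no data."""
    c = np.array([coef[int(f)][0] for f in frc])
    b = np.array([coef[int(f)][1] for f in frc])
    err = sp - (c + b * mxsp)
    out = np.full((96, 8), np.nan)
    for ti in range(1, 97):
        for f in range(1, 9):
            mask = (t == ti) & (frc == f)
            if mask.any():
                out[ti - 1, f - 1] = err[mask].mean()
    return out
```

Calling `fit_per_frc` on the first half and `mean_errors` on the second half of a data set yields a 96 × 8 array of averaged prediction errors of the kind plotted in Fig. 1.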

Fig. 1 The figure shows the (estimated) daily variation of prediction errors from a regression model for link traveling speed in km/h with road type (frc) specific intercepts and speed limit slope coefficients for three road types (1, 4, 7) and three cities (Vienna, Linz, and Lyon)

Figure 1 also reveals considerable differences in the daily variation patterns across cities, which raises concerns about the approach of [12] of using the same pattern for all cities. Figure 2 further investigates this issue. To this end, we arrange the prediction error averages in matrices \(\hat {\boldsymbol {\Gamma }}_{q}=[\hat \gamma _{t,j| q}] \in \mathbb R^{96 \times 8}\), wherein \(\hat \gamma _{t,j| q}\) denotes the average pertaining to time interval t, road type j (frc), and city q ∈ {Vienna, Linz, Lyon}. Hence, the j-th column of \(\hat {\boldsymbol {\Gamma }}_{q}\) embodies the estimated daily variation for the j-th road type (frc) in city q. Insufficient data on frc class 2 for Linz leads to several missing entries in the second column of \(\hat {\boldsymbol {\Gamma }}_{\text {Linz}}\). Therefore we nullify the second column of \(\hat {\boldsymbol {\Gamma }}_{q}\) for all three cities to ensure comparability. Panels (a)–(d) of Fig. 2 show the first four left singular vectors \(\hat {\mathbf {u}}_{1| q}, \dots , \hat {\mathbf {u}}_{4| q}\) of the 96 × 8 matrices \(\hat {\boldsymbol {\Gamma }}_{q}={\sum }_{i\leq 7} \hat \sigma _{i| q}\hat {\mathbf {u}}_{i| q}\hat {\mathbf {v}}_{i| q}^{\mathsf {T}}\), wherein \(\hat \sigma _{i| q}\) and \(\hat {\mathbf {v}}_{i| q}\) denote the i-th singular value and i-th right singular vector of \(\hat {\boldsymbol {\Gamma }}_{q}\), respectively.

Fig. 2 Panels (a)–(d) show the first four left singular vectors \(\hat {\mathbf {u}}_{1| q},\dots , \hat {\mathbf {u}}_{4|q}\) of the estimated 96 × 8 daily variation matrices \(\hat {\boldsymbol {\Gamma }}_{q}\) for three cities q ∈ {Vienna, Linz, Lyon}. Panels (e)–(h) show the corresponding loading vectors \(\hat {\sigma }_{1|q}\hat {\mathbf {v}}_{1| q},\dots , \hat {\sigma }_{4| q}\hat {\mathbf {v}}_{4| q}\) calculated by multiplication of the respective singular value \(\hat {\sigma }_{i| q}\) and right singular vector \(\hat {\mathbf {v}}_{i| q}\). Roads of type (frc) 2 are excluded from the estimation due to lack of corresponding data. The zero loadings for road type 2 in panels (e)–(h) reflect this omission

We observe that the first two left singular vectors shown in panels (a) and (b) are quite similar across cities, but one distinct feature of Lyon stands out. These two signals mostly represent the decline of average traveling speed during the day in panel (a) as well as the morning and evening peak in panel (b), respectively. In particular, the second left singular vector for Lyon reflects the considerable difference in magnitude between the morning and evening peak shown in Fig. 1. In contrast, the two peaks are quite similar for Vienna and Linz, which manifests in the divergence of the three left singular vectors in panel (b) during this time. Panels (c) and (d) reveal considerable differences between the left singular vectors \(\hat {\mathbf {u}}_{3| q}\), \(\hat {\mathbf {u}}_{4| q}\) across cities; \(\hat {\mathbf {u}}_{5| q}, \dots , \hat {\mathbf {u}}_{7| q}\) (not shown) exhibit differences across cities q that are qualitatively comparable to those of \(\hat {\mathbf {u}}_{3| q}\) and \(\hat {\mathbf {u}}_{4| q}\). Finally, it should be kept in mind that Figs. 1 and 2 show estimates, which are subject to sampling uncertainty.

Panels (e)–(h) of Fig. 2 show the coefficient vectors \(\hat {\sigma }_{i| q} \hat {\mathbf {v}}_{i| q}\) corresponding to the first four left singular vectors \(\hat {\mathbf {u}}_{i| q}\). More specifically, each column of \(\hat {\boldsymbol {\Gamma }}_{q}\), which represents the daily pattern for one road category in the form of 96 estimates \(\hat {\gamma }_{t,j| q}\), t ≤ 96, equals a linear combination of the left singular vectors \(\hat {\mathbf {u}}_{1| q}, \dots ,\hat {\mathbf {u}}_{7| q}\) (panels (a)–(d)). The latter can therefore be interpreted as basic daily patterns. The coefficients \(\hat {\sigma }_{i| q}\hat v_{j,i| q}\) (called loadings herein) corresponding to the i-th basic pattern \(\hat {\mathbf {u}}_{i| q}\) and the eight road types j = 1, … , 8 gather in the vector \(\hat {\sigma }_{i| q}\hat {\mathbf {v}}_{i| q}\). The zero loadings \(\hat {\sigma }_{i| q}\hat v_{2,i|q}\) for road type (frc) 2 with respect to all basic patterns i = 1, … , 7 reflect its omission in the estimation. The other loadings express how the respective basic pattern enters the daily variation of the corresponding road type. In case of a positive loading \(\hat {\sigma }_{i| q}\hat v_{j,i| q}\), the basic pattern i enters in the form shown in panels (a)–(d) of Fig. 2. A negative loading implies that the basic pattern is turned upside down. The absolute value of each loading governs the strength of the respective basic pattern in the daily variation of the corresponding road type. In this regard, we observe two notable features. Firstly, the signs of \(\hat {\sigma }_{1| q}\hat v_{j,1| q}\) and \(\hat {\sigma }_{2| q}\hat v_{j,2| q}\) are identical across cities. Thus, the first two basic patterns enter in the same form across cities q, which affirms the above interpretation of \(\hat {\mathbf {u}}_{1| q}\) and \(\hat {\mathbf {u}}_{2| q}\). However, the magnitude of the loadings differs considerably for some road types, that is, the basic patterns occur with different strength across cities. Secondly, the loadings decrease rapidly as i increases but are still of considerable size for some road types and i > 2; see panels (g) and (h).
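The decomposition into basic patterns and loadings can be reproduced with a standard singular value decomposition. The sketch below uses NumPy and a synthetic 96 × 8 matrix as a stand-in for an estimated daily variation matrix; the function name and the random input are assumptions for illustration and do not reproduce the estimates shown in Fig. 2.

```python
import numpy as np


def patterns_and_loadings(gamma):
    """Thin SVD of a 96 x 8 daily variation matrix.

    Returns U (columns: basic patterns u_i) and L (row i: loading vector
    sigma_i * v_i over the eight road types).
    """
    U, s, Vt = np.linalg.svd(gamma, full_matrices=False)
    return U, s[:, None] * Vt


rng = np.random.default_rng(0)
gamma = rng.normal(size=(96, 8))  # synthetic stand-in, not real estimates
gamma[:, 1] = 0.0                 # nullified column for road type (frc) 2
U, L = patterns_and_loadings(gamma)

# Column j of gamma is the linear combination sum_i L[i, j] * U[:, i].
reconstructed = U @ L
```

Note that the signs of a basic pattern and of its loading vector are only determined jointly: flipping both leaves the product unchanged, so a sign convention has to be fixed before comparing patterns across cities.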

Finally, we observe that the ratio \(\lVert \hat {\boldsymbol {\Gamma }}_{q}-\tilde {\boldsymbol {\Gamma }}_{\text {Vienna}}\rVert _{\text{\texttt{F}}}\big / \lVert \hat {\boldsymbol {\Gamma }}_{q}- \hat {\boldsymbol {\Gamma }}_{\text {Vienna}}\rVert _{\text{\texttt{F}}}\), wherein \(\tilde {\boldsymbol {\Gamma }}_{\text {Vienna}} = {\sum }_{i\leq 2} \hat {\sigma }_{i|\text {Vienna}}\hat {\mathbf {u}}_{i|\text {Vienna}}\hat {\mathbf {v}}_{i|\text {Vienna}}^{\mathsf {T}}\) and \(\lVert A \rVert _{\text{\texttt{F}}}^{2}={\sum }_{i,j}a_{i,j}^{2}\) for any matrix \(A = [a_{i,j}]\), equals 82.1% for q = Linz and 98.4% for q = Lyon. Hence, the rank two approximation \(\tilde {\boldsymbol {\Gamma }}_{\text {Vienna}}\) to the Viennese daily variation matrix \(\hat {\boldsymbol {\Gamma }}_{\text {Vienna}}\) is a closer substitute for \(\hat {\boldsymbol {\Gamma }}_{\text {Linz}}\) and \(\hat {\boldsymbol {\Gamma }}_{\text {Lyon}}\) than the full estimate \(\hat {\boldsymbol {\Gamma }}_{\text {Vienna}}\). This finding motivates the alternative prediction strategy proposed in the following Section 3 and evaluated in Sections 4 and 5.
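The ratio above can be computed as in the following sketch, which assumes the two daily variation matrices are available as NumPy arrays. The function names are hypothetical; a value below one indicates that the truncated matrix is the closer substitute.

```python
import numpy as np


def rank_r_approx(gamma, r):
    """Best rank-r approximation (in Frobenius norm) via truncated SVD."""
    U, s, Vt = np.linalg.svd(gamma, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]


def substitution_ratio(gamma_q, gamma_ref, r=2):
    """||gamma_q - rank-r(gamma_ref)||_F / ||gamma_q - gamma_ref||_F."""
    num = np.linalg.norm(gamma_q - rank_r_approx(gamma_ref, r), "fro")
    den = np.linalg.norm(gamma_q - gamma_ref, "fro")
    return num / den
```

A ratio below one arises, for example, when two cities share a low-rank signal but have unrelated higher-order components: truncation then removes the part of the reference matrix that does not transfer.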

3 Models, estimation & prediction

This paper considers predictions of the traveling speed sp = y (in km/h) and predictions of the ratio sp/mxsp = y′ of traveling speed to speed limit (both in km/h) for a given road type (frc) f ∈ {1, … , 8} and time interval t ∈ {1, … , 96}. We start with two general formulations

$$\begin{array}{@{}rcl@{}} y_{i}&=&\text{\texttt{sp}}_{i} = c_{f(i)} + \gamma_{t(i),f(i)} +\beta_{f(i)}\text{\texttt{mxsp}}_{i}+u_{i}\;\;\text{and} \end{array} $$
(1a)
$$\begin{array}{@{}rcl@{}} y_{i}^{\prime}&=&\frac{\text{\texttt{sp}}_{i}}{\text{\texttt{mxsp}}_{i}} = c_{f(i)}^{\prime}+ \gamma_{t(i),f(i)}^{\prime}+\beta_{f(i)}^{\prime}\text{\texttt{mxsp}}_{i}+u_{i}^{\prime}\;, \end{array} $$
(1b)

wherein t(i) denotes the time-of-the-day interval of observation i. The remainder terms \(u_{i}\) and \(u_{i}^{\prime }\) are assumed to be zero mean. The superscript ′ in Eq. 1b acknowledges the possibility of differences in parameter values between Eqs. 1a and 1b. The subsequent discussion is in terms of the former to circumvent superfluous replications. We tested the assumption of linearity in mxsp and found it to be appropriate; see the comment at the end of Section 5.

These general models lead to a number of variations by imposing different restrictions on the frc-specific intercept and slope coefficients \(c_{f}\) and \(\beta_{f}\) as well as the daily variations \(\gamma_{t,f}\). For the slope coefficient \(\beta_{f}\), we mostly focus on the unrestricted case. Section 5 also comments on the choice of a common slope coefficient \(\beta_{f} = \beta\), \(f = 1, \dots, 8\). For the daily variation coefficients \(\gamma_{t,f}\) we allow several (pre-specified) frc-based groups \(F_{0}, F_{1}, \dots, F_{g}\) (partitioning the set \(\{1, \dots, 8\}\)) of identical daily variation. More specifically, we use

$$ \gamma_{t,i}\,=\,\bar\gamma_{t,j}\;\text{for all}\; i \in F_{j}\;\text{with}\;\! \sum\limits_{t=1}^{96} \bar\gamma_{t,j}\!= 0,\;\! j\!=1,...,g, $$
(2)

and \(\bar \gamma _{t,0}=0\). Notable special cases of Eq. 2 are the case of no daily variation (\(F_{0} = \{1, \dots, 8\}\)), a single daily variation pattern (\(g = 1\), \(F_{0} = \emptyset\), \(F_{1} = \{1, \dots, 8\}\)), and the unrestricted case (\(g = 8\), \(F_{0} = \emptyset\), \(F_{h} = \{h\}\), \(1 \leq h \leq 8\)). In the latter case, no restrictions (except \({\sum }_{t\leq 96}\gamma _{t,f}=0\)) are imposed on \(\gamma_{t,f}\). The parameters can be conveniently collected into the matrix \(\boldsymbol{\Gamma} = [\gamma_{t,f}]\).
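To make the grouping in Eq. 2 concrete, the following sketch expands group-level daily profiles into the full 96 × 8 matrix. The dictionary-based interface and the function name are hypothetical, and centering each profile is merely one simple way to satisfy the sum-to-zero condition for illustration; it is not a description of the estimation procedure.

```python
import numpy as np


def expand_groups(groups, profiles):
    """Build the 96 x 8 matrix Gamma from group profiles under Eq. 2.

    groups:   dict j -> list of frc numbers; j = 0 means no daily variation.
    profiles: dict j -> sequence of 96 values for every j >= 1.
    """
    frcs = sorted(f for members in groups.values() for f in members)
    if frcs != list(range(1, 9)):
        raise ValueError("groups must partition {1, ..., 8}")
    gamma = np.zeros((96, 8))
    for j, members in groups.items():
        if j == 0:
            continue  # gamma-bar_{t,0} = 0
        profile = np.asarray(profiles[j], dtype=float)
        centered = profile - profile.mean()  # enforce sum over t equal to 0
        for f in members:
            gamma[:, f - 1] = centered
    return gamma
```

With `groups = {0: [], 1: list(range(1, 9))}` all eight columns coincide (single daily variation pattern), whereas `groups = {0: list(range(1, 9))}` returns the zero matrix (no daily variation).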

Additional constraints on \(\boldsymbol{\Gamma}\) allow further reduction in model complexity, e.g., \(\gamma_{t,f}\) being identical at night time, in order to optimize the variance-bias trade-off; we further comment on this in Section 5.

The models are fitted using regularized least-squares. The regularization aims at smoothing the daily variation by adding a penalty term \(\lambda \lVert \boldsymbol {\Delta }_{2}\boldsymbol {\Gamma }\rVert _{\text{\texttt{F}}}^{2}\) to the least-squares objective. Here \(\lambda \geq 0\) is the regularization constant, \(\lVert \cdot \rVert _{\text{\texttt{F}}}\) denotes the Frobenius norm (square root of the sum of the squares of the matrix entries), and \(\boldsymbol {\Delta }_{2}\) is the symmetric circulant matrix

$$\boldsymbol{\Delta}_{2}=\left( \begin{array}{cccccc} 2 & -1 & 0 & {\dots} & 0 & -1 \\ -1 & 2 & -1 & {\dots} & 0 & 0 \\ {\vdots} & {\vdots} & {\vdots} & {\ddots} & {\vdots} & {\vdots} \\ -1 & 0 & 0 & {\dots} & -1 & 2 \end{array}\right)\;. $$

This penalty encourages a smooth daily variation estimate; see [6, sec. 4.2] amongst others. The regularization constant λ is either set to zero (no regularization) or chosen by the generalized cross-validation (GCV) criterion [7]. The Appendix provides further details on an implementation strategy that easily adapts to large-scale applications.
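The effect of the penalty can be illustrated in a deliberately simplified setting. The sketch below constructs the circulant matrix and smooths a single, already computed raw daily profile by minimizing the sum of squared deviations plus the roughness penalty. This is an assumption-laden toy version: the estimator in this paper fits the daily variation jointly with intercepts and slopes, and the GCV-based choice of λ is not shown. All function names are hypothetical.

```python
import numpy as np


def second_difference_matrix(n=96):
    """Symmetric circulant matrix with 2 on the diagonal and -1 on the
    cyclically adjacent off-diagonals."""
    eye = np.eye(n)
    return 2 * eye - np.roll(eye, 1, axis=1) - np.roll(eye, -1, axis=1)


def roughness_penalty(gamma, lam):
    """lam * ||Delta_2 Gamma||_F^2 for a (n x k) matrix gamma."""
    d2 = second_difference_matrix(gamma.shape[0])
    return lam * np.linalg.norm(d2 @ gamma, "fro") ** 2


def smooth_profile(raw, lam):
    """Minimize ||raw - g||^2 + lam * ||Delta_2 g||^2 over g (toy setting)."""
    n = len(raw)
    d2 = second_difference_matrix(n)
    return np.linalg.solve(np.eye(n) + lam * d2.T @ d2, raw)
```

Because the rows of the circulant matrix sum to zero, constant profiles incur no penalty and the smoothed profile has the same sum as the raw one, so a sum-to-zero profile stays sum-to-zero in this toy setting. The cyclic structure also ties the last interval of the day to the first.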

The estimation is carried out separately for each of the three cities (Vienna, Linz, and Lyon); an additional subscript on the parameter estimates indicates the respective data source, e.g. \(\hat \gamma _{t,j|\text {Vienna}}\), in case the distinction is important.

Predictions \(\hat y\) either take the form

$$\begin{array}{@{}rcl@{}} \hat y_{i}(p\,|\, q) &=& \hat c_{f(i)| q} + \hat \gamma_{t(i),f(i)| q} + \hat \beta_{f(i)| q}\text{\texttt{mxsp}}_{i} \quad\text{or} \end{array} $$
(3a)
$$\begin{array}{@{}rcl@{}} \hat y_{i}(p\,|\, q) &=& \hat c_{f(i)| q} + \hat \gamma_{t(i),f(i)| q}^{r} + \hat \beta_{f(i)| q}\text{\texttt{mxsp}}_{i}\;, \end{array} $$
(3b)

wherein p, q ∈ {Vienna, Linz, Lyon} indicate the target city for prediction (p) and the data source for estimation (q). A hat superscript identifies least-squares estimates. Moreover, the daily variation coefficient estimate \(\hat {\gamma }_{t,f}^{r}\) in Eq. 3b amounts to the (t, f)-th entry of the rank r approximation

$$ \tilde{\boldsymbol{\Gamma}}_{q}^{r} = \sum\limits_{i\leq r}\hat{\sigma}_{i| q}\hat{\mathbf{u}}_{i| q}\hat{\mathbf{v}}_{i| q}^{\mathsf{T}} $$

of \(\hat {\boldsymbol {\Gamma }}_{q}\). The latter is calculated based on the singular value decomposition \(\hat {\boldsymbol {\Gamma }}_{q}={\sum }_{i\leq 8}\hat {\sigma }_{i| q}\hat {\mathbf {u}}_{i| q}\hat {\mathbf {v}}_{i| q}^{\mathsf {T}}\) of the estimate \(\hat {\boldsymbol {\Gamma }}_{q}\). Herein r ≤ 8 is pre-specified. The reduction implemented by choosing r < 8 acknowledges the above observation of similarities across cities between the terms \(\hat {\sigma }_{i}\hat {\mathbf {u}}_{i} \hat {\mathbf {v}}_{i}^{\mathsf {T}}\) for i ≤ 2 and dissimilarities for i > 2.
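The two prediction formulas differ only in the daily variation matrix that is plugged in. A minimal sketch, assuming the estimates from the source city are stored as NumPy arrays (`c` and `beta` of length 8, `gamma` of shape 96 × 8) and that `frc`, `t`, and `mxsp` describe the links of the target city; the function name and interface are hypothetical:

```python
import numpy as np


def predict_speed(c, beta, gamma, frc, t, mxsp, r=None):
    """Eq. 3a (r=None) or Eq. 3b (rank-r daily variation).

    frc takes values 1..8 and t takes values 1..96 (integer arrays).
    """
    if r is not None:
        # Replace gamma by its rank-r approximation from the truncated SVD.
        U, s, Vt = np.linalg.svd(gamma, full_matrices=False)
        gamma = (U[:, :r] * s[:r]) @ Vt[:r, :]
    return c[frc - 1] + gamma[t - 1, frc - 1] + beta[frc - 1] * mxsp
```

On-site and cross-city prediction use the same function; only the origin of `c`, `beta`, and `gamma` changes.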

4 Evaluation methodology

The various models are assessed by cross validation. Therein parameter estimation (training) is carried out on a data set from q ∈ {Vienna, Linz, Lyon} of size \(n_{e}\) = 500,000. Prediction errors (evaluation) are calculated using a separate data set from p ∈ {Vienna, Linz, Lyon} of size \(n_{v}\) = 500,000. That is, the cases \(p = q\) and \(p \neq q\) refer to on-site prediction and cross-city prediction, respectively. Training and evaluation data sets are randomly drawn from the respective full data sets in such a way that the two data sets are non-overlapping and each of the two has the same share of observations from every time interval as the full data set. This stratification ensures that all time intervals receive appropriate attention. Moreover, the lack of overlap ensures that on-site comparisons rely on out-of-sample predictions, too.
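One way to draw such samples is sketched below. The function name, the per-interval rounding of sample sizes, and the use of index lists are assumptions for illustration; because of the rounding, the resulting sample sizes may deviate slightly from the requested ones.

```python
import random
from collections import defaultdict


def stratified_split(intervals, n_train, n_eval, seed=0):
    """Draw disjoint training and evaluation index lists whose shares per
    time interval match those of the full data (up to rounding)."""
    rng = random.Random(seed)
    by_interval = defaultdict(list)
    for idx, t in enumerate(intervals):
        by_interval[t].append(idx)
    n = len(intervals)
    train, evaluation = [], []
    for t, members in by_interval.items():
        rng.shuffle(members)
        k_train = round(n_train * len(members) / n)
        k_eval = round(n_eval * len(members) / n)
        if k_train + k_eval > len(members):
            raise ValueError("not enough observations in interval %r" % (t,))
        train.extend(members[:k_train])
        evaluation.extend(members[k_train:k_train + k_eval])
    return train, evaluation
```

Repeating the call with different seeds yields replications of the kind used for the averages and standard deviations reported later.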

We express the predictive performance in terms of the (estimated) mean absolute percentage error (mape) calculated as

$$\begin{array}{@{}rcl@{}} \widehat{\text{\texttt{mape}}}(p\,|\, q) &=& \frac{1}{n_{v}}\sum\limits_{i = 1}^{n_{v}}\frac{\lvert y_{i}- \hat y_{i}(p\,|\, q)\rvert}{\lvert y_{i}\rvert}\quad\text{and} \end{array} $$
(4a)
$$\begin{array}{@{}rcl@{}} \widehat{\text{\texttt{mape}}^{\prime}}(p\,|\, q) &=& \frac{1}{n_{v}}\sum\limits_{i=1}^{n_{v}} \frac{\lvert y_{i}^{\prime}- \hat y_{i}^{\prime}(p\,|\, q)\rvert\,\lvert\text{\texttt{mxsp}}_{i}\rvert}{\lvert y_{i}\rvert}\;, \end{array} $$
(4b)

respectively. Here, \(y_{i}=\text{\texttt{sp}}_{i}\) and \(y_{i}^{\prime }=\text{\texttt{sp}}_{i}/\text{\texttt{mxsp}}_{i}\) denote the i-th response observation in an evaluation data set of size \(n_{v}\) = 500,000 from p ∈ {Vienna, Linz, Lyon}. In contrast to the root mean squared error used in [12], the criteria in Eqs. 4a and 4b acknowledge that prediction errors \(y-\hat y(p\,|\, q)\) and \((y^{\prime }-\hat y^{\prime }(p\,|\, q))\text{\texttt{mxsp}}\), respectively, occur at different speed levels.
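Both criteria can be written in a few lines. The sketch below uses plain Python lists and hypothetical function names; it also makes explicit that Eq. 4b equals Eq. 4a applied to the speed prediction obtained by multiplying the predicted ratio with the speed limit.

```python
def mape(y, y_hat):
    """Eq. 4a: mean of |y_i - y_hat_i| / |y_i|."""
    return sum(abs(a - b) / abs(a) for a, b in zip(y, y_hat)) / len(y)


def mape_ratio(y_ratio, y_ratio_hat, mxsp, y):
    """Eq. 4b: ratio-scale errors rescaled to the speed scale via mxsp."""
    total = sum(abs(a - b) * abs(m) / abs(s)
                for a, b, m, s in zip(y_ratio, y_ratio_hat, mxsp, y))
    return total / len(y)
```

The rescaling by the speed limit puts both model families on the same footing, so that their entries in a results table are directly comparable.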

The predictions \(\hat y(p\,|\, q)\) and \(\hat y^{\prime }(p\,|\, q)\) are based on either a special case of Eqs. 1a and 1b, respectively, or one of the additional benchmark strategies outlined below. In the former case, the calculation of predictions proceeds as in Eqs. 3a or 3b.

We repeat the calculation of Eqs. 4a and 4b for M = 50 different pairs of a training sample and a disjoint evaluation sample (the pairs necessarily overlap with one another) and for (p | q) equal to (Vienna | Vienna), (Linz | Linz), (Lyon | Lyon) (on-site) and (Linz | Vienna), (Lyon | Vienna) (cross-city). The replications reduce the effects of randomly drawing the estimation and prediction subsamples and generate information on the accuracy of the estimated performance measures. Table 1 reports averages and empirical standard deviations (in parentheses) of the estimates over the M = 50 replications and for the various prediction strategies. We considered several other variants but restrict the presentation to three benchmark procedures and some specializations of Eqs. 1a and 1b chosen to reflect the key lessons of our study. Section 5 comments on some extensions.

Table 1 Mean Absolute Percentage Error Estimates (Overview)

The three benchmark predictors for y comprise

  a) the speed limit \(\hat y_{i}=\text{\texttt{mxsp}}_{i}\),

  b) the scaled speed limit \(\hat y_{i}=\hat \beta \text{\texttt{mxsp}}_{i}\), wherein \(\hat \beta \) symbolizes a least-squares estimate, and

  c) a linear model prediction \(\hat y_{i}= \hat c+\hat \gamma _{t(i),f(i)}+\hat \beta _{f(i)}\text{\texttt{mxsp}}_{i}\), wherein t(i) and f(i) once more indicate the respective time interval and the road type (frc). The least-squares estimates \(\hat c\), \(\hat \gamma _{t,f}\), and \(\hat \beta _{f}\) represent an intercept, the daily variation, and the influence of the speed limit, respectively. The mxsp slope coefficient estimates \(\hat \beta _{f}\) are only allowed to differ across the road type (frc) groups {1, 2}, {3}, {4, 5, 6}, and {7, 8}. The daily variation coefficients \(\hat \gamma _{t,f} =\hat {\bar \gamma }_{t,1}\) are identical across all road types, which amounts to \(g = 1\), \(F_{0} = \emptyset\), \(F_{1} = \{1, \dots, 8\}\). In addition, the estimation of the daily variation coefficients \(\bar \gamma _{t,1}\) enforces identical coefficient estimates for (night) time intervals between 23:00 and 5:30.

The first two are used as benchmarks in [12]; the third method is the specification advocated therein. The second benchmark b) may be understood as a constant prediction of \(y^{\prime }_{i}=\text{\texttt{sp}}_{i}/\text{\texttt{mxsp}}_{i}\). Its acceptable performance—shown in Table 1—motivated the consideration of the refined ratio model in Eq. 1b.
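The first two benchmarks are simple enough to state in code. The sketch below assumes that the least-squares fit in b) is a regression through the origin, which is the natural reading of the formula but is not spelled out in the text; the function names are hypothetical.

```python
def fit_scale(sp, mxsp):
    """Least-squares slope of sp on mxsp without intercept."""
    return sum(s * m for s, m in zip(sp, mxsp)) / sum(m * m for m in mxsp)


def predict_speed_limit(mxsp):
    """Benchmark a): the speed limit itself, no estimation required."""
    return list(mxsp)


def predict_scaled(mxsp, beta_hat):
    """Benchmark b): the speed limit scaled by a fitted constant."""
    return [beta_hat * m for m in mxsp]
```

For cross-city use, `fit_scale` is applied to data from the source city and `predict_scaled` to the speed limits of the target city.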

In addition, we include several variants of the models Eqs. 1a and 1b. All of these specializations allow differences in the effect of the speed limit across road types (frc) via unrestricted slope coefficients \(\beta_{f}\); Section 5 comments on the restriction \(\beta_{f} = \beta\), \(f = 1, \dots, 8\). More specifically, we consider non-regularized fitting (λ = 0) of

  d) Eq. 1a with no daily variation (\(g = 0\), \(F_{0} = \{1, \dots, 8\}\)),

  e) Eq. 1a with a single daily variation (\(g = 1\), \(F_{0} = \emptyset\), \(F_{1} = \{1, \dots, 8\}\)),

  f) Eq. 1a with three groups of identical daily variation (\(g = 3\), \(F_{0} = \emptyset\), \(F_{1} = \{1, 2\}\), \(F_{2} = \{3, 4, 5, 8\}\), \(F_{3} = \{6, 7\}\)),

  g) Eq. 1a with unrestricted daily variation matrix \(\boldsymbol{\Gamma}\) (\(g = 8\), \(F_{0} = \emptyset\), \(F_{j} = \{j\}\), \(j \leq 8\)), and

  h) Eq. 1b with unrestricted daily variation matrix \(\boldsymbol{\Gamma}\).

The latter two cases g) and h) are fitted with and without regularization. Both estimation strategies are combined with prediction as in Eq. 3a as well as Eq. 3b with ranks r ∈ {1, 2}.

A few words on the selection of model variants are in order. The first two variants d) and e) act as further benchmarks reflecting the simplest variants of Eq. 1a. The grouping in f) showed the best performance among different configurations for the daily variation groups on the Viennese data (on-site). Its success reflects the similar loadings of these groups (shown in panels (e) and (f) of Fig. 2) with the first two signals (shown in panels (a) and (b) of Fig. 2). The most flexible configurations g) and h) show the best overall performance when fitted with regularization. The non-regularized cases help to judge the value of regularization.

5 Results

Table 1 summarizes the main results of our study. In particular, the upper third of Table 1 presents averages of mean absolute percentage error (MAPE) estimates from M = 50 replications of the same out-of-sample prediction exercises for six benchmark procedures. The parenthesized numbers are empirical standard deviations of these mean absolute percentage error estimates.

Here predictions are calculated by Eq. 3a; the use of Eq. 3b is indicated by adding the choice for the rank r to the respective column title. Similarly, an additional label regl. signals regularized least-squares fitting. All numbers are multiplied by 100 to reflect percentages and rounded to at most 4 significant digits. Boldface indicates the best result within the respective row.

The supplementary Tables 2, 3, 4, and 5 follow the same layout and typographical conventions as Table 1. Therein, bias and root mean squared error estimates are averages as in Eq. 4a (or Eq. 4b) but over powers of the prediction errors \(\left (y-\hat y(p\,|\, q)\right )^{s}\) (or \(\left (y^{\prime }-\hat y^{\prime }(p\,|\, q)\right )^{s}\text{\texttt{mxsp}}^{s}\)) with s = 1 (bias) and s = 2 (root mean squared error), respectively. Again, the reported numbers are means and standard deviations over M = 50 replications. The abbreviations ‘V.’, ‘Li.’, and ‘Ly.’ replace ‘Vienna’, ‘Linz’, and ‘Lyon’, respectively.
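For concreteness, the two supplementary criteria can be sketched as follows. The function names are hypothetical, and taking the square root after averaging the squared errors is an assumption implied by the name of the criterion rather than stated explicitly above.

```python
from math import sqrt


def bias(y, y_hat):
    """Average of the prediction errors y - y_hat (power s = 1)."""
    return sum(a - b for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Square root of the average squared prediction error (power s = 2)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))
```

For the ratio model, the same functions apply after multiplying the predicted ratios by the speed limit, which corresponds to the rescaled errors in the text.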

Table 2 Bias Estimates (Selected Models)
Table 3 Root Mean Squared Error Estimates (Selected Models)
Table 4 Mean Absolute Percentage Error Estimates (Selected Models with Single Slope Coefficient)
Table 5 Root Mean Squared Error Estimates (Selected Models)

Table 1 clearly shows that using the speed limit as the prediction of the actual traveling speed leads to a decisively worse performance than all its competitors. The latter is due to its neglect of congestion, which is already clearly visible in Fig. 1. This is also seen in Table 2 which contains (averaged) bias estimates for the three benchmark procedures a), b), and c) alongside g) and h) fitted with regularization and using Eq. 3a for prediction. The latter two as well as the method c) advocated by [12] are unbiased for on-site prediction and feature a moderate bias when used for cross-city prediction. The speed limit a) is highly biased in both cases; simply scaling the speed limit b) removes this deficiency to a large extent. However, this simplest method a) exhibits the unique selling point of not requiring any estimation and thus no actual speed measurements. The identical on-site and cross-city performance of a) in Linz and Lyon is an obvious consequence.

The specialization f) of Eq. 1a performs best among the benchmarks shown in the upper third of Table 1. The less flexible c) seemingly outperforms f) in cross-city prediction; however, its superiority is small and lies within the sampling uncertainty. The performance of b), c), d), and e) ranges between that of a) and f). Regarding these methods, note that the cross-city performance in Linz of b), c), and d) in terms of the mean absolute percentage error exceeds the respective on-site performance.

Table 3 shows (averaged) root mean squared error estimates for on-site and cross-city prediction in Linz using these methods. The latter measures are calculated from the same prediction errors as used for Table 1 to counter doubts regarding the results shown therein. More specifically, least-squares estimation implicitly optimizes the root mean squared error, and the on-site performance of b), c), and d) in Linz shown in Table 3 surpasses the respective cross-city result.

The second and third parts of Table 1 concern the procedures g) and h). In summary, these procedures outperform the simpler alternatives and yield the best results for both on-site and cross-city prediction when fitted with regularization and using Eq. 3a. Figure 3 shows how the gains in cross-city prediction are distributed over time. Panel (a) visualizes the differences between the (averaged) cross-city mean absolute percentage error estimates for g), fitted with regularization and using Eq. 3a, and the corresponding estimates for c). Panel (b) replaces g) with h). Notable improvements in cross-city performance are spread across the off-peak hours. In contrast, prediction quality remains largely unaltered during peaks.

Fig. 3 Panel (a) shows the difference between the (averaged) mean absolute percentage error estimates for procedure g) using Eq. 3a and fitted with regularization and those for c) separately for the 96 time intervals. Panel (b) compares h) (using Eq. 3a and regularized fitting) with c) in the same way
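The per-interval comparison shown in Fig. 3 can be sketched as follows. The sketch assumes that each prediction is tagged with the index of its time interval (0 to 95) and that the percentage error is taken relative to the observed speed. The record layout, the function names, and the toy numbers are hypothetical, and the averaging used in the study is not reproduced.

```python
from collections import defaultdict


def mape_by_interval(records):
    """records: iterable of (interval, predicted, observed).

    Returns a dict mapping each interval index to its MAPE in percent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for interval, predicted, observed in records:
        sums[interval] += abs(predicted - observed) / observed
        counts[interval] += 1
    return {k: 100.0 * sums[k] / counts[k] for k in sums}


def mape_difference(records_new, records_ref):
    """MAPE of the new method minus MAPE of the reference, per shared interval."""
    new = mape_by_interval(records_new)
    ref = mape_by_interval(records_ref)
    return {k: new[k] - ref[k] for k in sorted(new.keys() & ref.keys())}


# Hypothetical predictions for one off-peak interval (12) and one peak interval (32).
reference = [(12, 45.0, 50.0), (32, 27.0, 30.0)]
flexible = [(12, 48.0, 50.0), (32, 27.0, 30.0)]

diff = mape_difference(flexible, reference)
for interval, value in diff.items():
    print(f"interval {interval}: {value:+.1f} percentage points")
```

Negative values indicate intervals in which the more flexible procedure improves on the reference; in this toy example the gain occurs only in the off-peak interval.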

Table 4 reports (averaged) mean absolute percentage error estimates for cross-city prediction using two variants of g) and h) with restricted slope coefficients. It shows that the cross-city results of g) and h) can be improved in the case of Lyon by enforcing an identical speed limit slope coefficient for all road types via $\beta_f = \beta$, $f = 1, \dots, 8$. Cross-city prediction for Linz degrades under this constraint.
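The effect of such an equality constraint can be illustrated with a deliberately simplified sketch. It assumes a model with a road-type-specific intercept and a speed limit slope only, ignoring the daily variation component and the regularization used for g) and h); this simplification, the function names, and the toy data are assumptions and do not reproduce the estimation procedure of the paper. Under this assumption, the common slope is obtained in closed form by pooling the within-road-type sums of squares and cross products.

```python
from collections import defaultdict


def centered_sums(pairs):
    """Return (S_xy, S_xx) of (x, y) pairs after removing the group means."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return s_xy, s_xx


def fit_slopes(road_types, limits, speeds):
    """Least-squares slopes per road type and the common slope under beta_f = beta.

    Requires the speed limit to vary within every road type (S_xx > 0).
    """
    groups = defaultdict(list)
    for f, x, y in zip(road_types, limits, speeds):
        groups[f].append((x, y))
    sums = {f: centered_sums(pairs) for f, pairs in groups.items()}
    separate = {f: s_xy / s_xx for f, (s_xy, s_xx) in sums.items()}
    common = (sum(s_xy for s_xy, _ in sums.values())
              / sum(s_xx for _, s_xx in sums.values()))
    return separate, common


# Hypothetical toy data: two road types, speed limits and speeds in km/h.
road_types = [1, 1, 1, 2, 2, 2]
limits = [50.0, 70.0, 90.0, 30.0, 50.0, 70.0]
speeds = [30.0, 40.0, 50.0, 20.0, 26.0, 32.0]

separate, common = fit_slopes(road_types, limits, speeds)
print("separate slopes:", separate)
print("common slope:", common)
```

The constrained fit replaces one slope per road type by a single pooled slope, which reduces the number of parameters at the price of a less flexible model.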

Similarly, the record of Eq. 3b is mixed. Tables 1 and 4 indicate a slightly improved cross-city performance for Lyon and losses in prediction accuracy for Linz. In light of Table 3, one may suspect that gains are masked by the use of the mean absolute percentage error. However, the corresponding root mean squared error estimates in Table 5 disprove this suspicion.

A comparison of our main results (averaged MAPEs in Table 1) with competing studies in the literature is in order. Moghaddam and Hellinga [14] provide models for predicting freeway travel times based on Bluetooth data. Thus, these authors consider an arguably simpler setting and data of higher quality. They find MAPEs of 13 % to 18 %. Stathopoulos and Karlaftis [19] develop multivariate ARIMA and state space models in a setting similar to ours and obtain MAPEs between 12 % and 20 %. Tulic et al. [22] investigate the Vienna taxi FCD in detail. They also build multivariate autoregressive models for a smaller number of links. In addition, these authors find that the MAPE for a given link varies strongly with the number of measurements on the link that are based on only a single observed taxi. Their MAPE results range from roughly 10 % (for links with almost no measurements based on only one taxi) up to 40 % (for links with almost all measurements based on only one taxi), with averages over all links ranging from 20 % to 30 % depending on the time of day. In summary, the averaged MAPEs in Table 1 (ranging from 21 % to 25 %) are of a similar magnitude to those found in related studies.

Finally, partial residual plots [6, Section 3.1.3] derived from the linear formulations showed no indication of a nonlinear effect of the speed limit mxsp. Nonetheless, we experimented with more flexible modeling of the effect of the speed limit mxsp, in particular, polynomial terms as well as more general non-parametric approaches, and orthogonality constraints on the columns of Γ to enforce zero variation in the mean at night time. We found no improvements in the prediction performance when using these extensions and therefore refrain from a detailed discussion.

6 Conclusion & outlook

This paper formulates several models expressing traveling speeds on a road link at a given time in terms of its road type (frc), its speed limit, and the time of day; see Section 3. Restricting the choice of covariates to time and static map information allows prediction when no measurement data is available. More specifically, we use the model fits derived from Viennese data to obtain predictions for the Austrian city Linz and the French city Lyon; see Section 4. We evaluate these transfer predictions using actual data for both cities and show that using the Viennese fit as a surrogate for a city-specific fit entails merely a moderate loss in prediction accuracy; see Section 5. We conclude that this transfer is a reasonable means to obtain traveling time predictions for cities which lack actual measurement data.

Concerning the model choice, we find that good on-site performance coincides with good transfer performance. We therefore suggest selecting a model based on its on-site prediction performance. In our study, a flexible model allowing distinct daily variation patterns for different road types (frc) together with regularized least-squares fitting dominates all its competitors. The Appendix provides instructions for a numerically efficient implementation of this procedure. In addition, we note that slight reductions in model complexity may further improve the transfer performance. Specifically, we consider equality constraints on the speed limit slope coefficients and a rank reduction of the daily variation coefficient matrix.

Finally, we list four possible directions for further research. Firstly, estimates for cross-city prediction can be obtained from a pooled data set consisting of data from two (or more) cities. Intuitively, these estimates should reflect peculiarities of individual cities to a lower degree than estimates based on data from a single city and therefore be better suited for cross-city prediction. A semi-transfer provides another alternative, applicable when too little data is available for a given city. More specifically, data fusion techniques can be used to “update” the transferred fit based on the available data prior to prediction. Secondly, our study focuses on three cities. Investigating whether our findings generalize to a broader context is clearly important for the intended application. Thirdly, one may use different regularization constants $\lambda_j$ for the daily variation patterns $\gamma_j$ pertaining to different road types $j$ to reflect the different levels of smoothness of the daily variation patterns shown in Fig. 1. Finally, a rank constraint could be added to the estimation of Γ to obtain an even better bias/variance trade-off. Therein, the appropriate rank may either be enforced as a hard constraint or estimated by penalization as in [17].

In conclusion, for the data sets examined in this paper, the out-of-sample prediction error amounts to roughly 25 %. Transferring model fits did not decrease the accuracy for very similar regions: Linz shows an error of 25 % for both on-site and cross-city evaluation. The penalty is higher for more distant cities: Lyon shows on-site errors of less than 22 % and transfer errors of more than 24 %, a difference of roughly 2.3 percentage points. It remains to be seen whether the more sophisticated models alluded to in the preceding paragraph can change this picture.