1 Introduction

Travel time distribution, or variability in travel time, is the most useful indicator of the performance and reliability of a transportation system. The properties of this distribution are shaped by various uncertainties that arise from the supply side, the demand side and other external factors of a particular road network. A wider travel time distribution indicates higher uncertainty and lower travel time reliability. Measures of central tendency of the travel time distribution alone cannot explain the traveler’s experience. Recently, various empirical travel time reliability studies [4, 11, 17] and Asakura [2] have extensively used the travel time distribution as a tool for developing reliability indices such as Planning Time (95th percentile travel time), Buffer Time Index and Planning Time Index [14]. All these reliability indices are useful for improving regional transportation planning [10].

When we intend to evaluate the effects of a transport policy on travel time reliability, it is necessary to identify the factors (sources of uncertainty) that affect travel time and the relations among these sources. In this section, existing studies [1] related to sources of travel time variation are reviewed. Very few studies have concentrated on quantifying the sources of uncertainty that make travel time unreliable. Van der Loop identified the main causes of unreliability of travel times for urban roads in the Netherlands. According to his study, 74 % of the unreliability in travel time is due to internal factors of the traffic. The remainder is due to weather (8 %), road works (14 %), accidents (3 to 12 %) and combinations of factors (2 %) [18].

The US Federal Highway Administration (FHWA) has identified seven sources of events that cause travel time variation and has grouped them into three main categories: traffic influencing events (traffic incidents, work zones and weather), traffic demand events (fluctuations in normal traffic and special events) and physical highway features (traffic control devices and bottlenecks) [3]. Ruimin [16] examined travel time variability under the influence of time of day, day of week, weather and traffic accidents. In that study, the author quantified the sources of travel time variability with multiple linear regression models that include two-way interactions. In another study, the Florida Department of Transportation (Florida DOT) developed empirical travel time variability models as a function of the frequency of incidents, work zones and weather conditions, using regression analysis on combinations of different scenarios of uncertainty sources [5]. Asakura [2] further categorized the sources of travel time fluctuation into three groups: demand side factors such as day-to-day traffic variation, supply side factors such as road closure due to accidents, and external factors such as adverse weather and natural disasters. Most of the studies in the literature used a deterministic approach to model travel time variation under the influence of various supply side and demand side factors. Travel time variation on the Kobe Route of the Hanshin Expressway is mainly due to traffic volume, traffic accidents and the amount of rainfall.

The present study is an attempt to model the travel time distribution under various uncertainties. For this, the Stochastic Response Surface Method (SRSM) has been adopted. SRSM [6] is an extension of the classical Response Surface Method (RSM) to systems with stochastic inputs and outputs. The motivation for preferring this model over traditional Multiple Linear Regression (MLR) and other deterministic approaches is that those approaches fail to map the stochastic behavior between the response variable and the explanatory variables when travel time variation is uncertain. In particular, two continuous random factors were considered in this paper: traffic volume and rainfall intensity.

Archived continuous data from supersonic vehicle detectors on the Kobe Route of the Hanshin Expressway network in Japan were used in this study. Travel time was estimated for the study corridor with the time slice method. Traffic incident data were collected for the same study period to model the travel time variation under various uncertainties. The same data were used to develop both a traditional statistical model (regression) and stochastic models, and the two modeling approaches were then compared.

2 Study area and data collection

2.1 Study area

Data used in this study were collected from a section of the Kobe Route (Route Number 3) of the Hanshin Expressway, Japan. The Kobe Route extends between the cities of Kobe and Osaka and is about 30 km long. For the present study, data from a 14.9 km section of the route between Nishinomiya IC and Awaza were used for modeling the travel time distribution. Figure 1 shows the study area and the Kobe Route of the Hanshin Expressway.

Fig. 1 Study Area of Kobe Route, Route No. 3 of Hanshin Expressway

2.2 Data collection

Supersonic vehicle detectors are installed on the Hanshin Expressway every 500 m to observe traffic volume and time occupancy ratio. For this study, archived continuous detector data at 5 min intervals were collected for the entire year 2006. The section travel time for every 500 m section was estimated first. The path travel time for the study area was then estimated with the time slice method, which is explained in the next section. This travel time is influenced by the various incidents that occurred in the study area during the study period. Incident data for the study area, covering traffic accidents, road works, vehicle breakdowns, road cleaning and other traffic related incidents, were collected from Hanshin Expressway Corporation Ltd. for the entire year 2006. The number of incidents that occurred in the study area in 2006 is presented in Fig. 2, which shows that traffic accidents, vehicle breakdowns and road inspections were comparatively more frequent. Rainfall (mm/hr) data were collected from the official website of the Japan Meteorological Agency (JMA) [8]. Hourly rainfall data for Osaka were used for the present study area.

Fig. 2 Numbers of Incidents Occurred on Kobe Route for the Year 2006

Fig. 3 Route travel time estimation by Time slice method

3 Travel time estimation

From the vehicle detector data, spot speed is estimated for every 500 m interval, and the corresponding travel time for each section is calculated by transforming the speed data of that section. The path travel time for the study area (14.9 km) is then estimated with the time slice method, which accounts for the variation of speed over time by constructing the vehicle trajectory. The travel time obtained from this method is sufficiently close to the actual travel time [2]. Conventionally, the travel time of an entire route is calculated simply by accumulating the travel times of each section at a given time, as expressed in Eq. (1)

$$t_{is}(s) = \sum_{i=1}^{N} t_i(s) \tag{1}$$

where $t_i(s)$ denotes the travel time of section “i” at a given time “s”.

Small sections (500 m intervals) in a route are numbered sequentially in the downstream direction. The conventional method generates an instantaneous travel time, based on the assumption that vehicles traverse the route instantaneously. When traffic conditions are stable and travel speed is constant, the travel time can be calculated correctly with this instantaneous method. However, the estimated travel time may not be accurate when traffic flows are unstable. The alternative is the time slice method, in which the travel times of each section are accumulated successively, each read with a delay equal to the travel time over the preceding sections. The route travel time is denoted T(s) and is illustrated in Fig. 3. The estimation of travel time by the time slice method is expressed in Eq. (2). Yoshimura and Suga compared travel times estimated by the instantaneous method and the time slice method, using Automatic Vehicle Identification (AVI) data as true values. They found that the instantaneous method caused large errors during both the build-up and the dissipation of traffic congestion, whereas the time slice method could follow the actual travel time fluctuation without delay [19]. The time slice method is more suitable for offline than for online applications when the speed varies over time [9], and it provides better results than the instantaneous method. This path travel time is used as the dependent variable for travel time distribution modeling.

$$T(s) = \sum_{i=1}^{N} t_i\bigl(s + \tau_i(s)\bigr) \tag{2}$$

where $\tau_i(s)$ denotes the travel time from section 1 to section i−1 and is written as

$$\tau_i(s) = \sum_{j=1}^{i-1} t_j\bigl(s + \tau_j(s)\bigr)$$
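The difference between the instantaneous estimate of Eq. (1) and the time slice estimate of Eq. (2) can be illustrated with a minimal Python sketch. It assumes a hypothetical table `section_times[i][k]` holding the travel time (s) of section i during 5 min slice k; the numbers are made up for illustration and are not detector data from the study.

```python
def instantaneous_time(section_times, slice_index):
    """Eq. (1): add up all section travel times read at the same time slice."""
    return sum(row[slice_index] for row in section_times)


def time_slice_time(section_times, start, interval=300.0):
    """Eq. (2): read each section's travel time at the moment the vehicle enters it.

    start    -- departure time in seconds from the beginning of the data
    interval -- length of one time slice in seconds (5 min in the study)
    """
    n_slices = len(section_times[0])
    elapsed = 0.0  # tau_i(s): time already spent on sections 1 .. i-1
    for row in section_times:
        # slice containing the entry time, clamped to the last available slice
        k = min(int((start + elapsed) // interval), n_slices - 1)
        elapsed += row[k]
    return elapsed


# Illustrative values only: 3 sections x 3 slices, congestion builds on section 3
demo_times = [
    [250.0, 250.0, 250.0],
    [200.0, 200.0, 200.0],
    [100.0, 300.0, 300.0],
]
instantaneous = instantaneous_time(demo_times, 0)   # 250 + 200 + 100
time_slice = time_slice_time(demo_times, 0.0)       # 250 + 200 + 300
```

In this toy case the vehicle reaches section 3 after 450 s, that is, in the second slice, so the time slice method picks up the congested value that the instantaneous method misses.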

The travel time distribution of the study area for the entire year (sample size 8760) is plotted in Fig. 4. The probability and cumulative distributions give a visual representation of travel time variability over the period. The travel time for this 14.9 km section varies between a minimum of 498 s and a maximum of 4,383 s. The mean travel time of the study area is 760 s and the standard deviation is 346 s.

Fig. 4 Travel Time distributions of Study Area

4 Modeling travel time distribution

4.1 Multiple linear regression analysis

Multiple Linear Regression (MLR) analysis was carried out to understand the influence of all the incidents on travel time variation. A separate MLR analysis was carried out to understand the behavior of travel time variation on working days. The estimated MLR model coefficients for the entire year of data are shown in Table 1. The basic test of any model estimation is an examination of the signs of the model coefficients. Table 1 shows that the estimated coefficients of all the variables are positive for the entire year of data. This indicates that all the incidents contribute positively to travel time variation, which is logical since travel time increases when incidents occur. Based on the t-statistic values, it can be concluded that traffic volume, traffic accidents, vehicle breakdowns, road cleaning, rainfall and other incidents are significant variables contributing to travel time variation (the t-statistic is greater than the critical value of 1.64 at the 5 % level of significance). The high F value (371.53) and the corresponding low probability value (p < 0.05) indicate that the model is significant. The corresponding R² value indicates that the model explains 24 % of the total variation.

Table 1 MLR estimated coefficients for the study area

Similarly, MLR analysis was carried out for the working days data (249 days) to understand the effect of incidents on travel time variation during these days. The t-statistic values show that traffic volume, traffic accidents, rainfall and other incidents are highly significant for travel time variation on working days (the t-statistic is greater than the critical value of 1.64 at the 5 % level of significance). On the Hanshin Expressway, road works and road cleaning generally take place on non-working days, except in emergencies. The Analysis of Variance (ANOVA) of this model gives a high F value (275.20) with a very low probability value (p < 0.005), which demonstrates a very high significance for the regression model. The goodness of fit (R²) indicates that 25 % of the total variation is explained by this model. From Table 1, it was concluded that traffic volume, traffic accidents and rainfall are highly significant for travel time variation on this section of the Kobe Route.
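The quantities quoted above (coefficients, t-statistics, R² and F) can be computed as in the following Python sketch. It is only an illustration under stated assumptions: the data are synthetic stand-ins generated on the spot, not the Hanshin Expressway records, and the helper name `ols_summary` and all variable names are hypothetical.

```python
import numpy as np


def ols_summary(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients, their t-statistics, R^2 and the F value.
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])        # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - k - 1                             # residual degrees of freedom
    sigma2 = resid @ resid / dof                # residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)       # covariance of the estimates
    t_stat = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    f_stat = (r2 / k) / ((1.0 - r2) / dof)
    return beta, t_stat, r2, f_stat


# Synthetic stand-in data (assumption: NOT the data used in the paper)
rng = np.random.default_rng(0)
n = 500
volume = rng.uniform(500, 3000, n)      # hypothetical traffic volume, veh/hr
rain = rng.exponential(1.0, n)          # hypothetical rainfall, mm/hr
accident = rng.integers(0, 2, n)        # hypothetical 0/1 accident flag
travel_time = (500 + 0.1 * volume + 20 * rain + 150 * accident
               + rng.normal(0, 50, n))

X = np.column_stack([volume, rain, accident])
beta, t_stat, r2, f_stat = ols_summary(X, travel_time)
significant = np.abs(t_stat[1:]) > 1.64  # critical value quoted in the text
```

A positive sign in `beta[1:]` corresponds to the sign check described above, and `significant` marks the explanatory variables whose t-statistic exceeds the quoted critical value.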

Of these parameters, the continuous random attributes, namely traffic volume and rainfall, were considered in the probabilistic analysis for modeling the stochastic behavior of the travel time distribution. For this, the Stochastic Response Surface Method (SRSM) was used to model the travel time distribution under these uncertainties. A limitation of this model is that it considers only continuous random variables. Before the SRSM analysis, MLR analysis was carried out for the continuous random variables considered in the SRSM analysis, and the results are presented in Table 2.

Table 2 Model coefficient estimated by multiple regression analysis

Nonlinear regression analysis was also carried out to develop the relation among travel time, traffic volume and rainfall intensity. The functional form of the nonlinear model is given in Eq. (3). The model coefficients estimated for the study area data are presented in Table 3. The nonlinear model was found to be marginally better than the linear model: its R² value is 0.226, whereas the value for the linear model is 0.22.

Table 3 Model coefficient estimated by non linear analysis
$$y = \beta_0 + \beta_1\,TV + \beta_2\,RF + \beta_3\,TV^2 + \beta_4\,RF^2 + \beta_5\,TV \cdot RF \tag{3}$$

where y is the travel time, TV is the traffic volume (veh/hr), RF is the rainfall intensity (mm/hr) and β0 to β5 are the model coefficients.
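As a hedged illustration of how a model of the form of Eq. (3) can be fitted and compared with its linear counterpart, the Python sketch below builds the six-column design matrix and solves it by least squares. The data are synthetic, the traffic volume is scaled to thousands of veh/hr purely for numerical conditioning, and the function names are hypothetical.

```python
import numpy as np


def quadratic_design(tv, rf):
    """Columns of Eq. (3): 1, TV, RF, TV^2, RF^2, TV*RF."""
    return np.column_stack([np.ones_like(tv), tv, rf, tv**2, rf**2, tv * rf])


def fit_r_squared(A, y):
    """Least-squares fit; returns the coefficients and R^2."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in data (assumption: not the study data)
rng = np.random.default_rng(1)
tv = rng.uniform(0.5, 3.0, 400)     # hypothetical volume, 1000 veh/hr
rf = rng.exponential(1.0, 400)      # hypothetical rainfall, mm/hr
y = 500 + 40 * tv + 15 * rf + 30 * tv**2 + rng.normal(0, 40, 400)

linear_design = np.column_stack([np.ones_like(tv), tv, rf])
_, r2_linear = fit_r_squared(linear_design, y)
beta_quad, r2_quadratic = fit_r_squared(quadratic_design(tv, rf), y)
```

Because the linear model is nested inside the quadratic one, the in-sample R² of the quadratic fit can never be lower than that of the linear fit, so a small gain such as 0.22 to 0.226 is weak evidence on its own.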

Further, these model coefficients were used to estimate the travel time at the collocation points (Table 5) generated for the 2nd order polynomial equations. The following sections discuss the SRSM analysis for the working days data.

4.2 Stochastic response surface method

Probabilistic analysis is the most widely used method for characterizing uncertainty in physical and social systems, especially when estimates of the probability distributions of the uncertain parameters are available. Such models can describe uncertainty arising from stochastic disturbances, variable conditions and risk considerations. The main process in probabilistic models comprises probability encoding of the inputs and propagation of uncertainties through the model. Probability encoding of the inputs involves determining the probability distribution of each input parameter and incorporating its random variation. This is accomplished with statistical estimation techniques that fit probability distributions to the available field data. Figure 5 illustrates the concept of uncertainty propagation for travel time. At each point of the response surface (the calculated output value of travel time), the changes in traffic volume and rainfall are characterized by the probability density functions (PDF) of these inputs. The methodology for adopting this approach is discussed in the next section.

Fig. 5 Schematic Representation of Propagation of Travel Time Uncertainty

Fig. 6 Evaluation of SRSM: Probability distribution and Cumulative Probability Distribution

4.2.1 Methodology

The Stochastic Response Surface Method (SRSM) [6, 7] is an extension of the classical deterministic response surface method (RSM). RSM is a collection of mathematical and statistical techniques that are useful for the modeling and analysis of problems in which a response of interest is influenced by several variables [13]. RSM also quantifies the relationship between the measured responses and the input factors. The main difference between RSM and SRSM is the way the input parameters are supplied. SRSM is an efficient alternative to conventional sampling-based methods for uncertainty analysis; it approximates both the inputs and the outputs of the uncertain system through series expansions in well-behaved standard random variables (srvs). The series expansion of the outputs contains coefficients that can be calculated from the results of a limited number of model simulations.

The srvs are selected from a set of independent, identically distributed (iid) normal random variables ξi (i = 1, 2, …, n), where “n” is the number of independent inputs and each ξi has zero mean and unit variance. The following steps are involved in applying SRSM to the uncertainty analysis of a model with random inputs and random outputs.

  • Step 1: Representation of stochastic model inputs: For each uncertain input, a corresponding srv is assigned and the input random variable is expressed in terms of that srv. If the input random variables are mutually independent, the uncertainty in the i-th input variable Xi is expressed as a function of the srv ξi.

    $$X_i = f_i(\xi_i) \tag{4}$$
  • Step 2: Functional approximation of model output: Each model output is expressed as a series expansion in terms of the srvs, in the form of multidimensional Hermite polynomials with unknown coefficients. A second order polynomial approximation is generally recommended in the literature, and it can be refined with higher order terms depending on the accuracy needed. In this study a second order polynomial function with two independent variables ξ1 and ξ2 was considered, as given in Eq. (5).

    $$y = a_0 + a_1\xi_1 + a_2\xi_2 + a_3\left(\xi_1^2 - 1\right) + a_4\left(\xi_2^2 - 1\right) + a_5\,\xi_1\xi_2 \tag{5}$$
  • Step 3: Estimation of unknown coefficients in the functional approximation: The unknown coefficients in Eq. (5) are estimated by equating the model outputs with the corresponding polynomial expansions at a set of collocation points. Preferably, the roots of the Hermite polynomial of the next higher order are used to generate the collocation points [15].

  • Step 4: Calculation of the statistical properties of model outputs: Once the coefficients have been estimated, the model outputs can be evaluated. The statistical properties of the outputs, such as the probability density function and the moments of “y”, can then be readily calculated. This is accomplished by generating a large number of srv samples and calculating the values of the inputs and the outputs through the transformations of Eqs. (4) and (5).
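As a worked illustration of Step 4, the first two moments of the second order expansion follow directly from the orthogonality of the Hermite polynomials. This is a standard property rather than a result reported in this study; it assumes that ξ1 and ξ2 are independent standard normal variables and that y has the form y = a0 + a1 ξ1 + a2 ξ2 + a3 (ξ1² − 1) + a4 (ξ2² − 1) + a5 ξ1 ξ2.

```latex
\begin{align*}
\mathrm{E}[y] &= a_0, \\
\mathrm{Var}[y] &= a_1^2\,\mathrm{E}[\xi_1^2] + a_2^2\,\mathrm{E}[\xi_2^2]
  + a_3^2\,\mathrm{E}\!\left[(\xi_1^2-1)^2\right]
  + a_4^2\,\mathrm{E}\!\left[(\xi_2^2-1)^2\right]
  + a_5^2\,\mathrm{E}[\xi_1^2\xi_2^2] \\
 &= a_1^2 + a_2^2 + 2a_3^2 + 2a_4^2 + a_5^2,
\end{align*}
% using E[xi^2] = 1, E[(xi^2 - 1)^2] = E[xi^4] - 2E[xi^2] + 1 = 3 - 2 + 1 = 2,
% and E[xi_1^2 xi_2^2] = 1; all cross terms vanish by orthogonality.
```

Higher moments and the full probability density do not have such a compact form, which is why they are obtained by sampling the srvs.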

4.2.2 SRSM analysis

Of the 5675 working days samples, 4180 were incident free. During these periods travel time varies only because of fluctuations in traffic volume and the effect of rainfall. SRSM was applied to model the travel time distribution resulting from these continuous random variables. Table 4 shows the uncertainty ranges of the model parameters and the sampling strategy used for transforming the uncertain variables for the SRSM model, together with the statistical parameters of the input and response variables. Goodness of fit between the observed frequencies (from the data) and the theoretical fitted frequencies was tested with the Kolmogorov-Smirnov test and the chi-square goodness of fit test for the two fitted distributions: the lognormal distribution for traffic volume and the exponential distribution for rainfall. The results indicate that, at the 5 % level of significance, the null hypothesis cannot be rejected, that is, there is no significant difference between the empirical and theoretical cumulative distributions. Therefore the lognormal and exponential distributions were used to generate the random data for traffic volume and rainfall, respectively, in the SRSM model.

Table 4 Uncertainty ranges of model parameter and response variable

A second order SRSM model was used to approximate the travel time response (Eq. 5). To solve for the second order polynomial expansion, the roots of the third order Hermite polynomial, +√3, −√3 and 0, are used. The points are selected such that each srv takes the value of either 0 or one of the other roots of the polynomial. There are therefore nine possible collocation points: (0, 0), (√3, 0), (0, √3), (−√3, 0), (0, −√3), (√3, √3), (−√3, −√3), (√3, −√3) and (−√3, √3).
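The nine collocation points can be enumerated with a few lines of Python. This is a small sketch for illustration; the helper `he3` is the third order (probabilists') Hermite polynomial He3(x) = x³ − 3x, whose roots are 0 and ±√3.

```python
import math
from itertools import product


def he3(x):
    """Third order Hermite polynomial He3(x) = x^3 - 3x."""
    return x**3 - 3.0 * x


# Roots of He3: each srv takes one of these three values
roots = (0.0, math.sqrt(3.0), -math.sqrt(3.0))

# All combinations for two srvs: 3 x 3 = 9 collocation points (xi1, xi2)
collocation_points = list(product(roots, repeat=2))
```

With six unknown coefficients and nine points, the resulting system is overdetermined, which is why a least-squares type solver is needed in the next step.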

The model input values of traffic volume and rainfall at these points were generated with the transformation technique and are presented in Table 5. For the lognormal distribution the transformation exp(μ + σξ1) was used, and for the exponential distribution −(1/λ) log(1/2 + (1/2) erf(ξ2/√2)) was used [6], where erf is the error function.
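The two transformations can be written as the short Python functions below. This is a sketch only: the function names are hypothetical, and the parameter values used in any call (μ, σ, λ) would come from Table 4, which is not reproduced here.

```python
import math


def lognormal_from_srv(xi, mu, sigma):
    """Lognormal input from a standard normal srv: X = exp(mu + sigma * xi)."""
    return math.exp(mu + sigma * xi)


def exponential_from_srv(xi, lam):
    """Exponential input from a standard normal srv.

    X = -(1/lam) * log(1/2 + 1/2 * erf(xi / sqrt(2)))
    The argument of the log is the standard normal CDF of xi, which is
    uniform on (0, 1), so X follows an exponential distribution with rate lam.
    """
    cdf = 0.5 + 0.5 * math.erf(xi / math.sqrt(2.0))
    return -math.log(cdf) / lam


# At xi = 0 both transformations return the median of the target distribution
median_lognormal = lognormal_from_srv(0.0, mu=1.0, sigma=0.5)   # exp(mu)
median_exponential = exponential_from_srv(0.0, lam=2.0)         # ln(2) / lam
```

Note that the exponential transformation is decreasing in ξ2, so large positive values of the srv correspond to little rainfall; for extremely negative ξ2 (below about −8) the CDF underflows to zero in double precision and the logarithm is undefined.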

The unknown coefficients in Eq. 5 of the SRSM model were solved with the singular value decomposition method, and the resulting coefficients (a0, a1, …, a5) are presented in Table 6. Eq. 5 fits the points listed in Table 5 well. The high R² value (0.99) indicates that the 2nd order polynomial equation reproduces the travel times at the collocation points closely. The high t-statistic values indicate that the coefficients of the linear, quadratic and interaction terms are significant.
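A minimal Python sketch of this step is given below. It uses `numpy.linalg.lstsq`, which relies on an SVD-based LAPACK routine, as a stand-in for the authors' MATLAB implementation. The function `toy_model` is a purely hypothetical replacement for the travel time model evaluated at the collocation points, so the resulting coefficients have no relation to Table 6.

```python
import numpy as np
from itertools import product


def hermite_basis(xi1, xi2):
    """Basis of the second order expansion: 1, xi1, xi2, xi1^2-1, xi2^2-1, xi1*xi2."""
    xi1 = np.asarray(xi1, dtype=float)
    xi2 = np.asarray(xi2, dtype=float)
    return np.column_stack(
        [np.ones_like(xi1), xi1, xi2, xi1**2 - 1, xi2**2 - 1, xi1 * xi2]
    )


def toy_model(xi1, xi2):
    """Hypothetical model output (s) at given srv values; illustration only."""
    return 700.0 + 80.0 * xi1 - 30.0 * xi2 + 20.0 * (xi1**2 - 1) + 10.0 * xi1 * xi2


r = np.sqrt(3.0)
points = np.array(list(product([0.0, r, -r], repeat=2)))   # nine collocation points
A = hermite_basis(points[:, 0], points[:, 1])               # 9 x 6 system matrix
y = toy_model(points[:, 0], points[:, 1])

# Least-squares solution of the overdetermined system (SVD-based solver)
coeffs, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)

# Step 4: sample the srvs and evaluate the fitted expansion
rng = np.random.default_rng(42)
xi = rng.standard_normal((4180, 2))                         # sample size used in the text
samples = hermite_basis(xi[:, 0], xi[:, 1]) @ coeffs
```

Because the toy model lies exactly in the span of the basis, the solver recovers its coefficients; with a real model the fit at the nine points would generally leave a residual.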

Table 5 Collocation points for 2nd order polynomial equations and corresponding travel time
Table 6 Model coefficients estimated by SRSM model

Once the coefficients are estimated, the travel time distribution can be fully described by random generation of a large number of samples. In this study 4180 random samples (the same size as the original data) were generated for the SRSM analysis. The whole procedure was implemented in the MATLAB environment [12]. The travel times estimated by the SRSM and MLR models are compared with the actual travel time in Fig. 6. The figure shows that the SRSM probability distribution is unimodal (with one maximum at 625 s), asymmetrical and similar in shape to the actual travel time distribution. In contrast, the travel time distributions obtained from the MLR models are bimodal, with one peak at 625 s and another at 925 s. Even the distribution estimated by the MLR model that includes all the uncertainty parameters (Table 1) is bimodal. From this it can be concluded that the MLR models overestimate travel time beyond the average actual travel time (783 s). It can also be concluded from Fig. 6 that, even when more uncertainty parameters are considered, the MLR models are unable to follow the actual travel time distribution. Further, the travel time obtained from the SRSM model is distributed between 542 s and 2,302 s, whereas the travel time estimated by the MLR models varies between 493 s and 1,363 s. The MLR models are therefore unable to map the worst case scenarios, as can be seen from the tails of the probability distribution of travel time (Fig. 6).

From the above discussion, it was observed that SRSM models are capable of analyzing the stochastic behavior of uncertain variables and that they perform better than the conventional regression model in modeling the travel time distribution. The algebraic expressions in terms of standard random variables (srvs) are smooth and continuous and can efficiently model the tails of the probability distributions of the outputs (Fig. 6). This shows that SRSM models are capable of modeling the worst case scenarios. Further, the observable difference between the distribution estimated by the SRSM model and the actual distribution may be reduced by increasing the number of uncertainty parameters in the model.

To validate the distributions obtained from both models, a chi-square goodness of fit test was carried out, with the actual travel time taken as the observed frequency and the travel time estimated by the SRSM and MLR models taken as the expected frequency. Travel time intervals of 30 s were used for frequency estimation. The results show that the MLR models have a higher chi-square value (6415) than the SRSM model (2182). This indicates that the MLR models have a greater discrepancy between the actual and estimated distributions than the SRSM model.
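The chi-square comparison can be sketched in Python as follows. The travel times in the example are made-up numbers, and skipping bins with a zero expected count is an assumption of this sketch, since the text does not state how empty bins were handled.

```python
def bin_counts(values, lower, width, n_bins):
    """Count values in n_bins consecutive intervals of the given width."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - lower) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over bins with a non-zero expected count."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Made-up travel times (s), for illustration only
actual_tt = [510, 525, 560, 590, 600, 640]
model_tt = [505, 545, 550, 585, 615, 645]

observed = bin_counts(actual_tt, lower=500, width=30, n_bins=5)   # 30 s bins
expected = bin_counts(model_tt, lower=500, width=30, n_bins=5)
statistic = chi_square(observed, expected)
```

A smaller statistic means the modelled frequencies are closer to the observed ones, which is the sense in which the SRSM value (2182) is preferred over the MLR value (6415).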

5 Conclusions

Travel time distribution is the most useful indicator of the performance of any transportation system, and the properties of this distribution are influenced by various uncertainties that arise from the supply side, the demand side and other external factors. Regression analysis between travel time and various uncertain parameters was used to develop the functional relationship among them. The t-statistic values showed that the effects of traffic volume, traffic accidents and the amount of rainfall are significant in the Hanshin Expressway study area. Further, SRSM models were applied in this study to carry out a probabilistic analysis. The uncertain parameters considered for modeling the travel time distribution were traffic volume and rainfall intensity. The travel time distribution obtained from the SRSM model was compared with those from the regression models, and the SRSM model was found to perform better than the regression models and to follow the actual travel time distribution more closely. Further, the difference between the distribution estimated by the SRSM model and the actual distribution may be reduced by increasing the number of uncertainty parameters in the model.