Temporal transferability: trade-off between data newness and the number of observations for forecasting travel demand

Transportation

Abstract

Recent and large amounts of data are crucial for forecasting travel demand. However, in some cases, an older time point may have more data than a more recent time point. The trade-off between older data with a large number of observations and recent data with a smaller number of observations has not been investigated in the context of temporal transferability. In this paper, this trade-off is examined for journey-to-work mode choice behaviours by utilising repeated cross-sectional data collected in Nagoya, Japan. Models estimated with different numbers of observations (ranging from 50 to 10,000) obtained at different time points (1971, 1981, and 1991) are applied to forecasting behaviours in 2001. Bootstrapping allows the comparisons to be tested statistically. One finding is that the minimum number of observations from a recent time point required to produce a forecast that is statistically significantly better than one produced by older data with a larger number of observations is surprisingly stable, even when the number of observations from the older time point varies considerably. For example, 300–500 observations from 1981 were sufficient to produce forecasts statistically significantly better than those produced by anywhere from 500 to 10,000 observations from 1971. Analysing the trade-off can help determine an efficient survey interval and sample size in an era of declining budgets for travel surveys.
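The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the Nagoya surveys, the model is a two-parameter binary logit rather than the author's specification, subsamples are drawn with replacement, and the names `simulate`, `fit_logit`, `log_likelihood`, and `bootstrap_gain` are hypothetical. It draws repeated subsamples from an "older" and a "recent" pool, estimates a model on each, applies both to a "target" year, and records the difference in target-year log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, beta):
    """Synthetic binary mode choices: column 0 is a constant, column 1 a covariate."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X, (rng.random(n) < p).astype(float)

def fit_logit(X, y, iters=25):
    """Binary logit by Newton-Raphson; a tiny diagonal term keeps the Hessian invertible."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hess = (X * w[:, None]).T @ X + 1e-6 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def log_likelihood(beta, X, y):
    """Log-likelihood of fixed parameters applied to a dataset (no re-estimation)."""
    z = X @ beta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

def bootstrap_gain(old, recent, target, n_old, n_recent, reps=200):
    """Target-year log-likelihood of the recent-data model minus that of the older-data model."""
    X_o, y_o = old
    X_r, y_r = recent
    X_t, y_t = target
    gains = np.empty(reps)
    for b in range(reps):
        i = rng.integers(0, len(y_o), size=n_old)     # assumed: sampling with replacement
        j = rng.integers(0, len(y_r), size=n_recent)
        ll_old = log_likelihood(fit_logit(X_o[i], y_o[i]), X_t, y_t)
        ll_recent = log_likelihood(fit_logit(X_r[j], y_r[j]), X_t, y_t)
        gains[b] = ll_recent - ll_old
    return gains

# Assumed drift: only the constant changes over time, and the recent pool is closer to the target.
old = simulate(10_000, np.array([-0.5, -1.0]))
recent = simulate(10_000, np.array([0.3, -1.0]))
target = simulate(10_000, np.array([0.5, -1.0]))

gains = bootstrap_gain(old, recent, target, n_old=5_000, n_recent=400)
print(gains.mean(), gains.std(ddof=1))
```

A positive mean gain indicates that, in this synthetic setting, the smaller recent sample forecasts the target year better than the larger older sample. Repeating the call over a grid of `n_old` and `n_recent` values would trace out the kind of trade-off the paper studies, although the statistical test applied in the paper itself is not reproduced here.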

Notes

  1. An extension to include the third possibility is discussed in Sanko (2015, 2016), where the author used the same methodology proposed in the present study to compare the second and third possibilities.

  2. The data utilised in this study were collected on weekdays in the autumn of each survey year. Models estimated with older data (e.g., collected last winter) might be more transferable to this winter than models estimated with more recent data (e.g., collected this summer). However, the present paper does not address this issue.

  3. The author benefited from Sikder’s (2013) explanations of the four levels of hierarchy that must be considered when analysing transferability. (Sikder analysed spatial transferability, but the present author believes that his explanations are still valid for temporal transferability.) The top three levels are (i)–(iii) in the main text. The main interest of this paper is (iv), model parameter estimates (e.g., transferability of coefficients of explanatory variables and other parameters such as elasticities and value of time measures). The two dimensions of data newness and the number of observations first affect the model parameter estimates and then the forecasting performance. Also note that the present study focuses more on the survey stage than on model estimation. Levels (i)–(iii) are issues in the estimation stage and are controllable by researchers after the survey, while data newness and the number of observations are issues in the data collection stage.

  4. Anonymous reviewers expressed concern about the impact of the model specification on the results of the present study. Model specifications reportedly affect temporal transferability, and most studies have shown that models with more explanatory variables are more temporally transferable (Badoe and Miller 1995b; Fox et al. 2014; Karasmaa and Pursula 1997; Train 1979). However, there is also a chance of overfitting (Badoe and Miller 1995a). The present study utilises the same model specification throughout the analysis, which means that sensitivity to different model specifications is less of a concern. Model specifications which produce better temporal transferability might exist. However, if these specifications were able to improve to the same extent both the models with older data and the models with more recent data, then the impact of the model specifications would be cancelled out. If the specifications produced different degrees of improvement, then the model specifications would affect the results of the study. This is a topic for future study.

  5. This relationship is often utilised to find the minimum number of observations needed to obtain a desired level of accuracy.

  6. These two studies have limited implications for the following reasons. (1) Two time points are inadequate for data use. Data must be prepared from at least three time points, and data from one time point must be used solely for validation. (2) The results might be coincidental, since these two studies randomly, but only once, chose a small number of observations. Karasmaa and Pursula (1997) repeated random sampling 60 times from 1998 datasets and estimated 60 models. While they examined the stability of the estimated parameters, the 60 sets of estimated parameters were not applied to forecasting.

  7. Parameters from older data are estimated in a manner that maximises Eq. (1) for the older data, not the 2001 dataset. However, more transferable models produce larger log-likelihood values in Eq. (1) when the estimated parameters are applied to the 2001 dataset.

  8. Forecasting errors are often divided into input errors and model errors (see de Jong et al. 2007 and Zhao and Kockelman 2002). However, this study is free from input errors since the estimated parameters are applied to the actual 2001 dataset (i.e., the explanatory variables in the 2001 dataset are not forecast).

  9. Sikder (2013) questioned the use of point parameter estimates and repeated random drawings from parameter estimates and their covariance matrix. Since he used a larger number of observations for modelling, the point estimates are assumed to be stable. However, this does not apply to the present study, and the author does not take this approach.

  10. Parameter values are restricted to estimated values when calculating $L(y_2, n_2, b)$ and $L(y_1, n_1, b)$. The likelihood ratio test cannot be used to compare two restricted models. Therefore, the author takes the following approach.

  11. Anonymous reviewers expressed concerns that the author’s model specification included too few explanatory variables. One reviewer thought that alternative-specific constants played a greater role in expressing factors not explained by explanatory variables, since explanatory variables might express only limited parts of the behaviours. The reviewer suggested including more explanatory variables so that the alternative-specific constants play a smaller role, and then comparing the results. However, since the author had already chosen the best model specification, another specification was not tried. Moreover, as noted in footnote 4, the impact of the model specification is less of a concern. The reviewer also suggested updating the constants with recent data and evaluating whether the better transferability of models estimated with more recent data stems from the constants alone or from both the constants and the parameters related to the explanatory variables. However, the author did not update the constants. This issue is addressed elsewhere (Sanko 2015, 2016). The reviewer mentioned that the impact of alternative-specific constants is greater when the modal shares change significantly. It is worthwhile to compare combinations of older and more recent time points in which the modal shares change to different degrees (see footnote 15).

  12. Sanko et al. (2013) estimated commuting mode choice models between car and public transportation for the Nagoya metropolitan area and found that the estimate for the car cost parameter was not significant. Moreover, the public transportation cost parameter was not included because it had the wrong sign. This empirically justifies the author’s approach of not considering the travel cost.

  13. Quantile–Quantile plots (not presented in this paper) also suggested that $x_b$ is normally distributed.

  14. An example: the requirement of having at least one bus user. The bus share in 1991 was 5%. If 50 observations are chosen from the 1991 dataset, then an average of 2.5 bus users will be included. However, some of the 200 bootstrap repetitions include no bus users; these are excluded when calculating the $x_b$ values. Therefore, the average number of bus users in the remaining repetitions is greater than 2.5, and the samples included for calculating the $x_b$ values are less representative of the entire set of 10,000 commuting trips.

  15. One anonymous reviewer questioned the role of alternative-specific constants when the modal shares change significantly. Fewer observations from the more recent time point were required to reject the null hypothesis for the 1971/1981 and 1971/1991 datasets than for the 1981/1991 dataset. Note that the former two cases include the 1971–1981 period that saw substantial share changes.
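
The arithmetic in note 14 can be checked with a short calculation. This is a sketch under a simplifying assumption that is not stated in the paper: each of the 50 sampled observations is treated as an independent draw with a 5% chance of being a bus user (a binomial approximation to sampling from the 10,000 trips).

```python
n = 50          # observations drawn per bootstrap repetition (from note 14)
share = 0.05    # bus share in 1991 (from note 14)

# Expected number of bus users in one repetition.
expected = n * share

# Probability that a repetition contains no bus users at all
# (assumes independent draws, i.e. a binomial approximation).
p_none = (1 - share) ** n

# Repetitions with zero bus users contribute nothing to the expected count,
# so conditioning on "at least one bus user" rescales the mean upward.
conditional_mean = expected / (1 - p_none)

print(expected, p_none, conditional_mean)
```

Under this approximation, roughly 8% of repetitions would contain no bus users, and the repetitions that remain would average about 2.7 bus users rather than 2.5, which is the direction of the bias that note 14 describes.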

References

  • Badoe, D.A., Miller, E.J.: Analysis of the temporal transferability of disaggregate work trip mode choice models. Transp. Res. Rec. 1493, 1–11 (1995a)

  • Badoe, D.A., Miller, E.J.: Comparison of alternative methods for updating disaggregate logit mode choice models. Transp. Res. Rec. 1493, 90–100 (1995b)

  • Ben-Akiva, M., Lerman, S.R.: Discrete choice analysis: theory and application to travel demand. MIT Press, Cambridge (1985)

  • de Jong, G., Daly, A., Pieters, M., Miller, S., Plasmeijer, R., Hofman, F.: Uncertainty in traffic forecasts: literature review and new results for The Netherlands. Transportation 34(4), 375–395 (2007)

  • Dissanayake, D., Kurauchi, S., Morikawa, T., Ohashi, S.: Inter-regional and inter-temporal analysis of travel behaviour for Asian metropolitan cities: case studies of Bangkok, Kuala Lumpur, Manila, and Nagoya. Transp. Policy 19(1), 36–46 (2012)

  • Duffus, L.N., Alfa, A.S., Soliman, A.H.: The reliability of using the gravity model for forecasting trip distribution. Transportation 14(3), 175–192 (1987)

  • Efron, B., Tibshirani, R.J.: An introduction to the bootstrap. Chapman & Hall, New York (1993)

  • Elmi, A.M., Badoe, D.A., Miller, E.J.: Transferability analysis of work-trip-distribution models. Transp. Res. Rec. 1676, 169–176 (1999)

  • Fox, J., Daly, A., Hess, S., Miller, E.: Temporal transferability of models of mode-destination choice for the Greater Toronto and Hamilton Area. J. Transp. Land Use 7(2), 41–62 (2014)

  • Hensher, D.A., Rose, J.M., Greene, W.H.: Applied choice analysis: a primer. Cambridge University Press, Cambridge (2005)

  • Karasmaa, N., Pursula, M.: Empirical studies of transferability of Helsinki metropolitan area travel forecasting models. Transp. Res. Rec. 1607, 38–44 (1997)

  • Parody, T.E.: Analysis of predictive qualities of disaggregate modal-choice models. Transp. Res. Rec. 637, 51–57 (1977)

  • Sanko, N.: Travel demand forecasts by using repeated cross-sectional data: attempt to express parameters as functions of gross domestic product per capita. Compendium of papers of the 93rd Annual Meeting of the Transportation Research Board. Washington, DC (2014a)

  • Sanko, N.: Travel demand forecasts improved by using cross-sectional data from multiple time points. Transportation 41(4), 673–695 (2014b)

  • Sanko, N.: Should small samples from recent time point be used with older data? Applicability of updating models by transfer scaling. Presented at the hEART 2015—4th symposium of the European Association for Research in Transportation, Copenhagen (2015)

  • Sanko, N.: Criteria for selecting model updating methods for better temporal transferability. Compendium of Papers of the 95th Annual Meeting of the Transportation Research Board, Washington, DC (2016)

  • Sanko, N., Dissanayake, D., Kurauchi, S., Maesoba, H., Yamamoto, T., Morikawa, T.: Inter-temporal analysis of household car and motorcycle ownership behaviors: the case in the Nagoya metropolitan area of Japan, 1981–2001. IATSS Res. 33(2), 39–53 (2009)

  • Sanko, N., Morikawa, T., Kurauchi, S.: Mode choice models’ ability to express intention to change travel behaviour considering non-compensatory rules and latent variables. IATSS Res. 36(2), 129–138 (2013)

  • Sikder, S.: Spatial transferability of activity-based travel forecasting models. Ph.D. Dissertation, University of South Florida, Tampa (2013)

  • Train, K.E.: A comparison of the predictive ability of mode choice models with various levels of complexity. Transp. Res. Part A 13(1), 11–16 (1979)

  • Zhao, Y., Kockelman, K.M.: The propagation of uncertainty through travel demand models: an exploratory analysis. Ann. Reg. Sci. 36(1), 145–163 (2002)

Acknowledgments

This work was supported by JSPS KAKENHI Grant Numbers 25380564 and 16K03931. The author acknowledges the use of data provided by the Chubu Regional Bureau, Japan’s Ministry of Land, Infrastructure, Transport and Tourism, and the NUTREND (Nagoya University TRansport and ENvironment Dynamics) Research Group. This paper is based on presentations given at the 94th Annual Meeting of the Transportation Research Board in Washington, D.C., U.S.A., in January 2015 and the 50th Infrastructure Planning Conference of the Japan Society of Civil Engineers, Tottori, Japan, in November 2014. Comments from anonymous reviewers were appreciated.

Author information

Corresponding author

Correspondence to Nobuhiro Sanko.

About this article

Cite this article

Sanko, N. Temporal transferability: trade-off between data newness and the number of observations for forecasting travel demand. Transportation 44, 1403–1420 (2017). https://doi.org/10.1007/s11116-016-9707-5

  • DOI: https://doi.org/10.1007/s11116-016-9707-5
