Using machine learning and big data approaches to predict travel time based on historical and real-time data from Taiwan electronic toll collection

  • Focus
  • Soft Computing

Abstract

As automation and computing technology advances, traffic data can be collected easily from multiple sources, such as sensors and surveillance cameras. Extracting value from these large volumes of data requires the ability to process large datasets and find patterns in them. In this paper, a machine learning method embedded within a big data analytics platform is constructed using the random forests method and Apache Hadoop to predict highway travel time from data collected by the highway electronic toll collection system in Taiwan. Several prediction models for highway travel time are then developed from historical and real-time data to provide drivers with estimated and adjusted travel time information.
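The random forests idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it does not use Apache Hadoop, the trip records are synthetic, and every name in it (`synthetic_trips`, `grow_tree`, `grow_forest`, `predict`), the two features (hour of day and a noisy recent travel-time reading), and the peak hours are assumptions made only for illustration. It shows bootstrap sampling, random feature choice at each split, and averaging of tree outputs for a regression target.

```python
import random
from statistics import mean

def synthetic_trips(n, seed=1):
    """Made-up records: [hour of day, noisy recent reading] -> travel minutes."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        hour = rng.randrange(24)
        # Assumption: trips in hours 7, 8, 17, 18 take about 15 minutes longer.
        minutes = 30.0 + (15.0 if hour in (7, 8, 17, 18) else 0.0) + rng.gauss(0, 1.0)
        X.append([hour, minutes + rng.gauss(0, 2.0)])
        y.append(minutes)
    return X, y

def sse(values):
    """Sum of squared deviations from the mean (the split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def grow_tree(X, y, rng, depth=0, max_depth=4, min_leaf=5):
    """Grow one regression tree; each split examines one randomly chosen feature."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)  # leaf: average travel time of the rows that reached it
    f = rng.randrange(len(X[0]))  # random feature subset of size 1
    values = sorted({row[f] for row in X})
    step = max(1, len(values) // 10)  # try roughly ten candidate thresholds
    best = None
    for t in values[::step]:
        left = [i for i, row in enumerate(X) if row[f] <= t]
        right = [i for i, row in enumerate(X) if row[f] > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:
        return mean(y)  # no admissible split on this feature
    _, t, left, right = best
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left],
                      rng, depth + 1, max_depth, min_leaf),
            grow_tree([X[i] for i in right], [y[i] for i in right],
                      rng, depth + 1, max_depth, min_leaf))

def predict_tree(node, row):
    """Follow the splits until a leaf value is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def grow_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict(forest, row):
    """The forest prediction is the average of the individual tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)

X, y = synthetic_trips(300)
forest = grow_forest(X, y)
peak = predict(forest, [8, 45.0])      # hour 8, recent reading of 45 minutes
off_peak = predict(forest, [3, 30.0])  # hour 3, recent reading of 30 minutes
print(f"peak estimate: {peak:.1f} min, off-peak estimate: {off_peak:.1f} min")
```

In this sketch the recent reading stands in for the real-time data and the hour of day for the historical pattern. In a setting like the one the abstract describes, the per-tree training step is the part that could be distributed across a cluster, since each tree is grown independently of the others.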

Acknowledgements

This study was partially funded by the Ministry of Science and Technology (Taiwan) under Grant MOST 105-2221-E-027-052-MY3.

Author information

Corresponding author

Correspondence to Shu-Kai S. Fan.

Ethics declarations

Conflict of interest

All the authors of this paper declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by Y. Ni.

About this article

Cite this article

Fan, SK.S., Su, CJ., Nien, HT. et al. Using machine learning and big data approaches to predict travel time based on historical and real-time data from Taiwan electronic toll collection. Soft Comput 22, 5707–5718 (2018). https://doi.org/10.1007/s00500-017-2610-y

  • DOI: https://doi.org/10.1007/s00500-017-2610-y
