Collaborative Data Analysis in Hyperconnected Transportation Systems

  • Mohammad Nozari Zarmehri
  • Carlos Soares
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 480)


Taxi trip duration affects the efficiency of operation, the satisfaction of drivers and, above all, the satisfaction of customers, which makes it an important metric for taxi companies. In particular, knowing the predicted trip duration beforehand is very useful for allocating taxis to taxi stands and for finding the best route for each trip. A hyperconnected network makes it possible to collect data from connected taxis in the city environment and to share it among taxis for better prediction. Because of the high volume of data, several models can be generated for each individual taxi. Moreover, given the differences between the data collected by different taxis, the data can be organized into different levels of a hierarchy. However, finding the level of granularity that leads to the best model for an individual taxi can be computationally expensive. This paper proposes the use of metalearning to select, for each taxi, the level of the hierarchy and the algorithm that generate the model with the best performance. The proposed approach is evaluated on data collected in the Drive-In project. The results show that metalearning helps to select the algorithm with the best performance.
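The selection problem described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hierarchy level names, algorithm names, metafeatures and error values are all hypothetical, and a 1-nearest-neighbour rule stands in for whatever metalearner the paper actually uses. The first function is the exhaustive baseline, which needs an error estimate for every (level, algorithm) pair; the second recommends a pair for a new taxi from the metafeatures of taxis that have already been evaluated.

```python
from math import dist

# Hypothetical metadata: one row per taxi whose best configuration is already
# known from exhaustive evaluation. Metafeatures are assumed to be
# (number of trips, mean trip duration in minutes); all values are made up.
known_taxis = [
    ((120.0, 9.5), ("taxi", "random_forest")),
    ((15.0, 14.0), ("city", "linear_regression")),
    ((60.0, 11.0), ("stand", "svm")),
]


def best_configuration(errors):
    """Exhaustive baseline: return the (level, algorithm) pair with the lowest error.

    `errors` maps each (level, algorithm) pair to an error estimate, so every
    candidate model must be trained and evaluated before this can be called.
    """
    return min(errors, key=errors.get)


def recommend(metafeatures, metadata):
    """1-nearest-neighbour metalearner (an assumption made for this sketch).

    Reuses the best configuration of the known taxi whose metafeatures are
    closest in Euclidean distance, without training any candidate model.
    """
    _, label = min(metadata, key=lambda row: dist(row[0], metafeatures))
    return label


# Usage: exhaustive search for one taxi versus a recommendation for a new taxi.
errors_for_one_taxi = {
    ("taxi", "svm"): 3.2,
    ("city", "random_forest"): 2.1,
    ("city", "linear_regression"): 4.0,
}
chosen = best_configuration(errors_for_one_taxi)
suggested = recommend((110.0, 10.0), known_taxis)
```

In practice the metafeatures would need to be scaled before computing distances, since the raw trip count dominates the distance in this toy example.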


Hyperconnected world; Machine learning; Metalearning; Data mining; Intelligent transportation systems; Collaborative data analysis



This research work has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014–2020), under grant agreement number 662189-MANTIS-2014-1.



Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  1. INESC TEC, Faculdade de Engenharia, Universidade do Porto (FEUP), Porto, Portugal
