A Thorough Review of Big Data Sources and Sets Used in Transportation Research

  • Maria Karatsoli
  • Eftihia Nathanail
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 36)


The development of Information and Communications Technology (ICT) and the Internet provides Intelligent Transport Systems (ITS) with a huge amount of real-time data. These data, the so-called “Big Data”, can be collected, interpreted, managed and analyzed in order to improve knowledge of the transport system. The use of these technologies has greatly enhanced the efficiency and user-friendliness of ITS, with significant economic and social impacts, and contributes positively to the management of sustainable mobility.

In this paper, different sources of big data that have been used in ITS are presented, and their advantages and limitations are discussed. Specifically, big data sources that have been used within the last 10 years are identified. Current applications are then reviewed in order to identify the most widely used and most suitable data source for each case.

The aim of the present study is to improve knowledge of the use of big data in transport planning and to better support ITS by providing decision makers with a roadmap for big data collection methods.


Data collection, Intelligent Transport Systems, Information and Communications Technology, Big data classification, Traffic information, Real-time data



This work has been supported by the ALLIANCE project and has been funded within the European Commission’s H2020 Programme under contract number 692426. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Civil Engineering, University of Thessaly, Volos, Greece
