Combining Bayesian Inference and Clustering for Transport Mode Detection from Sparse and Noisy Geolocation Data

  • Danya Bachir
  • Ghazaleh Khodabandelou
  • Vincent Gauthier
  • Mounim El Yacoubi
  • Eric Vachon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11053)


Large-scale, real-time transport mode detection is an open challenge for smart transport research. Although massive mobility data is collected from smartphones, mining mobile network geolocation is non-trivial because the data is sparse, coarse and noisy, and real transport labels are unknown. In this study, we process billions of Call Detail Records from the Greater Paris region and present the first method for transport mode detection of any traveling device. Cellphone trajectories, which are anonymized and aggregated, are constructed as sequences of visited locations, called sectors. Clustering and Bayesian inference are combined to estimate transport probabilities for each trajectory. First, we apply clustering to sectors, with features constructed from the spatial information of mobile networks and transport networks. Then, we extract a subset of \(15\%\) of sectors that have road and rail labels (e.g., train stations), while the remaining sectors are multi-modal. The proportion of labels per cluster is used to calculate transport probabilities given each visited sector. Thus, with Bayesian inference, each record updates the transport probability of the trajectory, without requiring the exact itinerary. For validation, we use the travel survey to compare daily average trips per user. With Pearson correlations reaching 0.96 for road and rail trips, the model appears to perform well and to be robust to noise and sparsity.
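
The per-record update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the uniform prior, the additive smoothing, the treatment of records as conditionally independent, and the direct use of per-cluster label proportions as update weights are all assumptions made for illustration.

```python
from collections import Counter

MODES = ("road", "rail")  # the two labels named in the abstract


def cluster_mode_probabilities(labeled_sectors, smoothing=1.0):
    """Estimate P(mode | cluster) from the labeled subset of sectors.

    labeled_sectors: iterable of (cluster_id, mode) pairs.
    smoothing: additive smoothing constant (an assumption, not from the
    paper) that keeps every probability strictly positive.
    """
    counts = {}
    for cluster_id, mode in labeled_sectors:
        counts.setdefault(cluster_id, Counter())[mode] += 1
    probs = {}
    for cluster_id, c in counts.items():
        total = sum(c[m] + smoothing for m in MODES)
        probs[cluster_id] = {m: (c[m] + smoothing) / total for m in MODES}
    return probs


def update_trajectory(prior, visited_clusters, cluster_probs):
    """Sequentially update mode probabilities, one record at a time.

    Each visited sector contributes the label proportions of its cluster
    as a multiplicative weight, followed by renormalization.
    """
    posterior = dict(prior)
    for cluster_id in visited_clusters:
        weights = cluster_probs[cluster_id]
        unnormalized = {m: posterior[m] * weights[m] for m in MODES}
        z = sum(unnormalized.values())
        posterior = {m: v / z for m, v in unnormalized.items()}
    return posterior


if __name__ == "__main__":
    # Hypothetical labeled sectors: cluster 0 is mostly road, cluster 1 is rail.
    labeled = [(0, "road")] * 3 + [(0, "rail")] + [(1, "rail")] * 4
    cluster_probs = cluster_mode_probabilities(labeled)
    uniform_prior = {m: 1.0 / len(MODES) for m in MODES}
    # A trajectory whose records fall in clusters 1, 1, then 0.
    print(update_trajectory(uniform_prior, [1, 1, 0], cluster_probs))
```

Because only the cluster of each visited sector enters the update, the order of records and the exact itinerary do not matter in this sketch, which mirrors the property claimed in the abstract.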


Mobile phone geolocation · Call Detail Records · Trajectory mining · Transport mode · Clustering · Bayesian inference · Big Data



This research work has been carried out in the framework of IRT SystemX, Paris-Saclay, France, and is therefore supported by public funds within the scope of the French Program “Investissements d’Avenir”. This work has been conducted in collaboration with Bouygues Telecom Big Data Lab.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Danya Bachir (1, 2, 3)
  • Ghazaleh Khodabandelou (2)
  • Vincent Gauthier (2)
  • Mounim El Yacoubi (2)
  • Eric Vachon (3)
  1. IRT SystemX, Palaiseau, France
  2. SAMOVAR, Telecom SudParis, CNRS, Université Paris Saclay, Paris, France
  3. Bouygues Telecom Big Data Lab, Meudon, France
