Analyzing large-scale human mobility data: a survey of machine learning methods and applications


Human mobility patterns reflect many aspects of life, from the global spread of infectious diseases to urban planning and daily commute patterns. In recent years, the prevalence of positioning methods and technologies, such as the global positioning system, cellular radio tower geo-positioning, and WiFi positioning systems, has driven efforts to collect human mobility data and to mine patterns of interest within these data in order to promote the development of location-based services and applications. Mining significant patterns in large-scale, high-dimensional mobility data calls for advanced analysis techniques, usually based on machine learning methods. In this paper, we therefore survey and assess approaches and models that analyze and learn human mobility patterns, mainly using machine learning methods. We categorize these approaches and models in a taxonomy based on their positioning characteristics, the scale of analysis, the properties of the modeling approach, and the class of applications they can serve. We find that these applications fall into three classes: user modeling, place modeling, and trajectory modeling, each with its own characteristics. Finally, we analyze the short-term trends and future challenges of human mobility analysis.

  21. There are other paging policies that use time-based and spatial-based approaches for location management, e.g., Krishnamachari et al. [46] and Xiao et al. [94].

  22. Note the difference from trajectory modeling (Sect. 4.3), in which moving objects along a route are analyzed together.


  1. Andrienko G, Andrienko N, Hurter C, Rinzivillo S, Wrobel S (2011) From movement tracks through events to places: extracting and characterizing significant places from mobility data. In: 2011 IEEE conference on visual analytics science and technology (VAST). IEEE, pp 161–170

  2. Andrienko N, Andrienko G, Stange H, Liebig T, Hecker D (2012) Visual analytics for understanding spatial situations from episodic movement data. Künstliche Intell 26(3):241–251

  3. Andrienko G, Divanis AG, Gruteser M, Kopp C, Liebig T, Rechert K (2013) Report from Dagstuhl: the liberation of mobile location data and its implications for privacy research. ACM SIGMOBILE Mob Comput Commun Rev 17(2):7–18

  4. Ashbrook D, Starner T (2003) Using GPS to learn significant locations and predict movement across multiple users. Pers Ubiquitous Comput 7(5):275–286

  5. Balcan D, Colizza V, Gonçalves B, Hu H, Ramasco JJ, Vespignani A (2009) Multiscale mobility networks and the spatial spreading of infectious diseases. Proc Natl Acad Sci 106(51):21484–21489

  6. Barak O, Cohen G, Toch E (2016) Anonymizing mobility data using semantic cloaking. Pervasive Mob Comput 28:102–112

  7. Barnes RM (1958) Time and motion study. Wiley, New York

  8. Becker RA, Caceres R, Hanson K, Loh JM, Urbanek S, Varshavsky A, Volinsky C (2011) A tale of one city: using cellular network data for urban planning. IEEE Pervasive Comput 10(4):18–26

  9. Bengtsson L, Lu X, Thorson A, Garfield R, Von Schreeb J (2011) Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: a post-earthquake geospatial study in Haiti. PLoS Med 8(8):e1001083

  10. Ben-Zion E, Lerner B (2017) Learning human behaviors and lifestyle by capturing temporal relations in mobility patterns. In: Proceedings of the European symposium on artificial neural networks, computational intelligence and machine learning (ESANN), Bruges

  11. Berlingerio M, Calabrese F, Di Lorenzo G, Nair R, Pinelli F, Sbodio ML (2013) Allaboard: a system for exploring urban mobility and optimizing public transport using cellphone data. In: Machine learning and knowledge discovery in databases. Springer, pp 663–666

  12. Bricka SG, Sen S, Paleti R, Bhat CR (2012) An analysis of the factors influencing differences in survey-reported and GPS-recorded trips. Transp Res Part C Emerg Technol 21(1):67–88

  13. Brockmann D, Hufnagel L, Geisel T (2006) The scaling laws of human travel. Nature 439(7075):462–465

  14. Buthpitiya S, Zhang Y, Dey AK, Griss M (2011) n-gram geo-trace modeling. In: Lyons K, Hightower J, Huang EM (eds) Pervasive computing. Lecture notes in computer science, vol 6696. Springer, Berlin, pp 97–114

  15. Calabrese F, Colonna M, Lovisolo P, Parata D, Ratti C (2011) Real-time urban monitoring using cell phones: a case study in Rome. IEEE Trans Intell Transp Syst 12(1):141–151

  16. Cao X, Cong G, Jensen CS (2010) Mining significant semantic locations from GPS data. Proc VLDB Endow 3:1009–1020

  17. Castro PS, Zhang D, Chen C, Li S, Pan G (2013) From taxi GPS traces to social and community dynamics: a survey. ACM Comput Surv (CSUR) 46(2):17

  18. Castro PS, Zhang D, Li S (2012) Urban traffic modelling and prediction using large scale taxi GPS traces. In: Pervasive computing. Lecture notes in computer science, vol 7319. Springer, Berlin, pp 57–72

  19. Cheng AJ, Chen YY, Huang YT, Hsu WH, Liao HYM (2011) Personalized travel recommendation by mining people attributes from community-contributed photos. In: Proceedings of the 19th ACM international conference on multimedia. ACM, pp 83–92

  20. Cho E, Myers SA, Leskovec J (2011) Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1082–1090

  21. Cranshaw J, Toch E, Hong J, Kittur A, Sadeh N (2010) Bridging the gap between physical location and online social networks. In: Proceedings of the 12th ACM international conference on ubiquitous computing. ACM, pp 119–128

  22. Cui J, Liu F, Janssens D, An S, Wets G, Cools M (2016) Detecting urban road network accessibility problems using taxi GPS data. J Transp Geogr 51:147–157

  23. Do T-M-T, Gatica-Perez D (2013) The places of our lives: visiting patterns and automatic labeling from longitudinal smartphone data, Technical report EPFL-ARTICLE-192391

  24. Carroll JD, Bevis HW (1957) Predicting local travel in urban regions. Pap Reg Sci 3(1):183–197

  25. Eagle N, Pentland AS (2006) Reality mining: sensing complex social systems. Pers Ubiquitous Comput 10(4):255–268

  26. Eagle N, Pentland AS (2009) Eigenbehaviors: identifying structure in routine. Behav Ecol Sociobiol 63(7):1057–1066

  27. Eagle N, Pentland AS, Lazer D (2009) Inferring friendship network structure by using mobile phone data. Proc Natl Acad Sci 106(36):15274–15278

  28. Ekman F, Keränen A, Karvo J, Ott J (2008) Working day movement model. In: Proceedings of the 1st ACM SIGMOBILE workshop on mobility models. ACM, pp 33–40

  29. Espin-Noboa L, Lemmerich F, Singer P, Strohmaier M (2016) Discovering and characterizing mobility patterns in urban spaces: a study of Manhattan taxi data. In: WWW’16 companion, pp 537–542

  30. Etter V, Kafsi M, Kazemi E (2012) Been there, done that: What your mobility traces reveal about your behavior. In: Mobile data challenge by Nokia Workshop, in conjunction with international conference on pervasive computing, number EPFL-CONF-178426

  31. Fan Z, Song X, Shibasaki R (2014) CitySpectrum: a non-negative tensor factorization approach. In: Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing. ACM, pp 213–223

  32. Farrahi K, Gatica-Perez D (2011) Discovering routines from large-scale human locations using probabilistic topic models. ACM Trans Intell Syst Technol (TIST) 2(1):3

  33. Frias-Martinez V, Soguero-Ruiz C, Frias-Martinez E, Josephidou M (2013) Forecasting socioeconomic trends with cell phone records. In: Proceedings of the 3rd ACM symposium on computing for development. ACM

  34. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer series in statistics. Springer, Berlin

  35. Giannotti F, Nanni M, Pedreschi D, Pinelli F (2007) Trajectory pattern mining. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 330–339

  36. Golle P, Partridge K (2009) On the anonymity of home/work location pairs. In: Tokuda H et al. (eds) Pervasive, LNCS 5538. Springer, pp 390–397

  37. González MC, Hidalgo CA, Barabási A-L (2008) Understanding individual human mobility patterns. Nature 453(7196):779–782

  38. Hariharan R, Toyama K (2004) Project Lachesis: parsing and modeling location histories. In: Egenhofer MJ et al. (eds) Geographic information science, LNCS 3234. Springer, pp 106–124

  39. Jensen B, Larsen JE, Jensen K, Larsen J, Hansen LK (2010) Estimating human predictability from mobile sensor data. In: 2010 IEEE international workshop on machine learning for signal processing (MLSP). IEEE, pp 196–201

  40. Jiang S, Fiore GA, Yang Y, Ferreira Jr. J, Frazzoli E, González MC (2013) A review of urban computing for mobile phone traces: current methods, challenges and opportunities. In: Proceedings of the 2nd ACM SIGKDD international workshop on urban computing. ACM, p 2

  41. Kagan E, Ben-Gal I (2013) Probabilistic search for tracking targets: theory and modern applications. Wiley, Hoboken

  42. Kagan E, Ben-Gal I (2015) Search and foraging: individual motion and swarm dynamics. CRC Press, Boca Raton

  43. Kitamura R, Chen C, Pendyala RM, Narayanan R (2000) Micro-simulation of daily activity-travel patterns for travel demand forecasting. Transportation 27(1):25–51

  44. Khoroshevsky F, Lerner B (2017) Human mobility-pattern discovery and next-place prediction from GPS data. In: Schwenker F, Scherer S (eds) Multimodal pattern recognition of social signals in human computer interaction (MPRSS). Lecture notes in computer science, vol 10183. Springer, Berlin

  45. Krings G, Calabrese F, Ratti C, Blondel VD (2009) Urban gravity: a model for inter-city telecommunication flows. J Stat Mech Theory Exp 2009(07):L07003

  46. Krishnamachari B, Gau R-H, Wicker SB, Haas ZJ (2004) Optimal sequential paging in cellular wireless networks. Wirel Netw 10(2):121–131

  47. Krumm J (2009) A survey of computational location privacy. Pers Ubiquitous Comput 13(6):391–399

  48. Krumm J, Horvitz E (2006) Predestination: inferring destinations from partial trajectories. In: UbiComp 2006: ubiquitous computing. Springer, pp 243–260

  49. Krumm J, Rouhana D (2013) Placer: semantic place labels from diary data. In: Proceedings of the 2013 ACM international joint conference on pervasive and ubiquitous computing. ACM, pp 163–172

  50. Lee J-G, Han J, Li X, Cheng H (2011) Mining discriminative patterns for classifying trajectories on road networks. IEEE Trans Knowl Data Eng 23(5):713–726

  51. Lee J-G, Han J, Whang K-Y (2007) Trajectory clustering: A partition-and-group framework. In: Proceedings of the 2007 ACM SIGMOD international conference on management of data. ACM, pp 593–604

  52. Li B, Zhang D, Sun L, Chen C, Li S, Qi G, Yang Q (2011) Hunting or waiting? Discovering passenger-finding strategies from a large-scale real-world taxi dataset. In: 2011 IEEE international conference on pervasive computing and communications workshops (PERCOM workshops). IEEE, pp 63–68

  53. Lichman M, Smyth P (2014) Modeling human location data with mixtures of kernel densities. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 35–44

  54. Lin M, Hsu W-J (2014) Mining GPS data for mobility patterns: a survey. Pervasive Mob Comput 12:1–16

  55. Lin M, Hsu W-J, Lee ZQ (2013) Detecting modes of transport from unlabelled positioning sensor data. J Locat Based Serv 7(4):272–290

  56. Liu H, Darabi H, Banerjee P, Liu J (2007) Survey of wireless indoor positioning techniques and systems. IEEE Trans Syst Man Cybern Part C (Appl Rev) 37(6):1067–1080

  57. Lu X, Bengtsson L, Holme P (2012) Predictability of population displacement after the 2010 Haiti earthquake. Proc Natl Acad Sci 109(29):11576–11581

  58. Luo W, Tan H, Chen L, Ni LM (2013) Finding time period-based most frequent path in big trajectory data. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data. ACM, pp 713–724

  59. Mao B, Cao J, Wu Z, Huang G, Li J (2012) Predicting driving direction with weighted Markov model. In: Advanced data mining and applications. Springer, pp 407–418

  60. Mazimpaka JD, Timpf S (2006) Trajectory data mining: a review of methods and applications. J Spat Inf Sci 2006(06):61–69

  61. Monreale A, Pinelli F, Trasarti R, Giannotti F (2009) WhereNext: a location predictor on trajectory pattern mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 637–646

  62. de Montjoye Y-A, Hidalgo CA, Verleysen M, Blondel VD (2013) Unique in the crowd: the privacy bounds of human mobility. Sci Rep 3:1376

  63. Montoliu R, Blom J, Gatica-Perez D (2013) Discovering places of interest in everyday life from smartphone data. Multimed Tools Appl 62(1):179–207

  64. Nanni M, Pedreschi D (2006) Time-focused clustering of trajectories of moving objects. J Intell Inf Syst 27(3):267–289

  65. Oloritun RO, Ouarda TB, Moturu S, Madan A, Pentland AS, Khayal I (2013) Change in BMI accurately predicted by social exposure to acquaintances. PLoS ONE 8(11):e79238

  66. Orlov YL, Filippov VP, Potapov VN, Kolchanov NA (2002) Construction of stochastic context trees for genetic texts. In Silico Biol 2(3):233–247

  67. Patterson DJ, Ding X, Kaufman SJ, Liu K, Zaldivar A (2009) An ecosystem for learning and using sensor-driven IM status messages. IEEE Pervasive Comput 8(4):42–49

  68. Pejovic V, Musolesi M (2015) Anticipatory mobile computing: a survey of the state of the art and research challenges. ACM Comput Surv (CSUR) 47(3):47

  69. Pelekis N, Kopanakis I, Kotsifakos EE, Frentzos E, Theodoridis Y (2009) Clustering trajectories of moving objects in an uncertain world. In: Ninth IEEE international conference on data mining (ICDM09). IEEE, pp 417–427

  70. Phithakkitnukoon S, Smoreda Z, Olivier P (2012) Socio-geography of human mobility: a study using longitudinal mobile phone data. PLoS ONE 7(6):e39253

  71. Poushter J (2016) Smartphone ownership and internet usage continues to climb in emerging economies, Pew Research Center

  72. Qu M, Zhu H, Liu J, Liu G, Xiong H (2014) A cost-effective recommender system for taxi drivers. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 45–54

  73. Reades J, Calabrese F, Sevtsuk A, Ratti C (2007) Cellular census: explorations in urban data collection. IEEE Pervasive Comput 6(3):30–38

  74. Reddy S, Mun M, Burke J, Estrin D, Hansen M, Srivastava M (2010) Using mobile phones to determine transportation modes. ACM Trans Sens Netw (TOSN) 6(2):13

  75. Rösler R, Liebig T (2013) Using data from location based social networks for urban activity clustering. In: Vandenbroucke D, Bucher B, Crompvoets J (eds) Geographic information science at the heart of Europe. Lecture notes in geoinformation and cartography. Springer, Berlin

  76. Sadilek A, Krumm J, Horvitz E (2013) Crowdphysics: planned and opportunistic crowdsourcing for physical tasks. In: Seventh international AAAI conference on weblogs and social media

  77. Scellato S, Musolesi M, Mascolo C, Latora V, Campbell AT (2011) Nextplace: a spatio-temporal prediction framework for pervasive systems. In: International conference on pervasive computing. Springer, Berlin, pp 152–169

  78. Shoval N et al (2008) The use of advanced tracking technologies for the analysis of mobility in Alzheimer’s disease and related cognitive diseases. BMC Geriatr 8(1):7

  79. Sohn T, Varshavsky A, LaMarca A, Chen MY, Choudhury T, Smith I, Consolvo S, Hightower J, Griswold WG, De Lara E (2006) Mobility detection using everyday GSM traces. In: UbiComp 2006: ubiquitous computing, LNCS 4206. Springer, pp 212–224

  80. Song C, Qu Z, Blumm N, Barabási AL (2010) Limits of predictability in human mobility. Science 327(5968):1018–1021

  81. Song C, Koren T, Wang P, Barabási AL (2010) Modeling the scaling properties of human mobility. Nat Phys 6:818–823

  82. Song L, Kotz D, Jain R, He X (2006) Evaluating next-cell predictors with extensive Wi-Fi mobility data. IEEE Trans Mob Comput 5(12):1633–1649

  83. Song X, Shibasaki R, Yuan NJ, Xie X, Li T, Adachi R (2017) DeepMob: learning deep knowledge of human emergency behavior and mobility from big and heterogeneous data. ACM Trans Inf Syst (TOIS) 35(4):41

  84. Souto G, Liebig T (2016) On event detection from spatial time series for urban traffic applications. In: Michaelis S, Piatkowski N, Stolpe M (eds) Solving large scale learning tasks. Challenges and algorithms. Lecture notes in computer science, vol 9580. Springer

  85. Szalai A (1966) Trends in comparative time-budget research. Am Behav Sci 9(9):3–8

  86. Toch E, Cranshaw J, Drielsma PH, Tsai JY, Kelley PG, Springfield J, Cranor L, Hong J, Sadeh N (2010) Empirical models of privacy in location sharing. In: Proceedings of the 12th ACM international conference on ubiquitous computing, Ubicomp 10. ACM, New York, pp 129–138

  87. Tong Y, Chen Y, Zhou Z, Chen L, Wang J, Yang Q, Lv W (2017) The simpler the better: a unified approach to predicting original taxi demands based on large-scale online platforms. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1653–1662

  88. Tsai JY, Kelley PG, Cranor LF, Sadeh N (2010) Location-sharing technologies: privacy risks and controls. I/S J Law Policy Inf Soc 6:119

  89. Wang H, Fu Y, Wang Q, Yin H, Du C, Xiong H (2017) A location-sentiment-aware recommender system for both home-town and out-of-town users. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1135–1143

  90. Wang J, Prabhala B (2012) Periodicity based next place prediction. In: Proceedings of the Nokia mobile data challenge workshop

  91. Wang P, González MC, Hidalgo CA, Barabási A-L (2009) Understanding the spreading patterns of mobile phone viruses. Science 324(5930):1071–1076

  92. Wei LY, Zheng Y, Peng WC (2012) Constructing popular routes from uncertain trajectories. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 195–203

  93. Wesolowski A, Eagle N, Tatem AJ, Smith DL, Noor AM, Snow RW, Buckee CO (2012) Quantifying the impact of human mobility on malaria. Science 338(6104):267–270

  94. Xiao Y, Pan Y, Li J (2004) Design and analysis of location management for 3G cellular networks. IEEE Trans Parallel Distrib Syst 15(4):339–349

  95. Yan Z, Chakraborty D, Parent C, Spaccapietra S, Aberer K (2013) Semantic trajectories: mobility data computation and annotation. ACM Trans Intell Syst Technol (TIST) 4(3):49

  96. Ying JJ-C, Lee W-C, Tseng VS (2013) Mining geographic-temporal-semantic patterns in trajectories for location prediction. ACM Trans Intell Syst Technol (TIST) 5(1):2

  97. Zaslavsky A, Chakraborty D et al. (2011) Recognizing concurrent and interleaved activities in social interactions. In: IEEE 9th international conference on dependable, autonomic and secure computing (DASC). IEEE, pp 230–237

  98. Zhang C, Zhang K, Yuan Q, Zhang L, Hanratty T, Han J (2016) Gmove: group-level mobility modeling using geo-tagged social media. In: KDD: proceedings of international conference on knowledge discovery and data mining, vol 2016, p 1305

  99. Zhang D, Zhang D, Xiong H, Yang LT, Gauthier V (2015) NextCell: predicting location using social interplay from cell phone traces. IEEE Trans Comput 64(2):452–463

  100. Zhang JD, Chow CY (2013) iGSLR: personalized geo-social location recommendation: a kernel density estimation approach. In: Proceedings of the 21st ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 334–343

  101. Zheng VW, Zheng Y, Xie X, Yang Q (2010) Collaborative location and activity recommendations with GPS history data. In: Proceedings of the 19th international conference on world wide web. ACM, pp 1029–1038

  102. Zheng Y, Capra L, Wolfson O, Yang H (2014) Urban computing: concepts, methodologies, and applications. ACM Trans Intell Syst Technol (TIST) 5(3):38

  103. Zheng D, Hu T, You Q, Kautz H, Luo J (2014) Inferring home location from user’s photo collections based on visual content and mobility patterns. In: Proceedings of the 3rd ACM multimedia workshop on geotagging and its applications in multimedia, pp 21–26

  104. Zhu Y, Zhong E, Lu Z, Yang Q (2012) Feature engineering for place category classification. In: Mobile data challenge by Nokia workshop, in conjunction with international conference on pervasive computing

This work is supported by the Israeli Ministry of Science, Technology, and Space, Grant No. 3-8709: Learning and mining mobility patterns using stochastic models. We would like to thank Omer Barak and Gabriella Cohen for their help in collecting data for the survey.

Author information

Corresponding author

Correspondence to Eran Toch.

About this article

Cite this article

Toch, E., Lerner, B., Ben-Zion, E. et al. Analyzing large-scale human mobility data: a survey of machine learning methods and applications. Knowl Inf Syst 58, 501–523 (2019).

  • Human mobility patterns
  • Mobile phones
  • Machine learning
  • Data mining