Integrating GPS trajectory and topics from Twitter stream for human mobility estimation


Understanding urban dynamics and large-scale human mobility will play a vital role in building smart cities and achieving sustainable urbanization. Existing research in this domain mainly relies on a single data source (e.g., GPS data or CDR data). In this study, we collect large-scale heterogeneous data and investigate the relationship between spatiotemporal topics found in geo-tagged tweets and GPS traces from smartphones. We apply Latent Dirichlet Allocation-based topic modeling to geo-tagged tweets to extract and classify topics. The extracted topics and the temporal population distribution derived from GPS traces are then used jointly to model urban dynamics and human crowd flow. The experimental results and validations demonstrate the efficiency of our approach and suggest that fusing cross-domain data for urban dynamics modeling is more practical than previously thought.
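The two steps named in the abstract (topic extraction from geo-tagged tweets, then joint use with GPS-derived population) can be illustrated with a minimal sketch. This is not the authors' implementation, which is not given here. It assumes a plain collapsed Gibbs sampler for Latent Dirichlet Allocation and a simple fusion in which each (grid cell, hour) bucket gets its mean topic mixture concatenated with a population count. The toy tweets, the cell identifiers, the population values, and the function names `lda_gibbs` and `fuse` are all hypothetical.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]        # tokens of doc d in topic k
    topic_word = [[0] * V for _ in range(n_topics)]   # tokens of word v in topic k
    topic_total = [0] * n_topics                      # tokens in topic k
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = assignments[d][i], word_id[w]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(n_topics)])
    return theta


def fuse(tweets, gps_population, n_topics=2):
    """Mean topic mixture per (cell, hour), followed by the GPS population count."""
    theta = lda_gibbs([tokens for _, _, tokens in tweets], n_topics)
    sums = defaultdict(lambda: [0.0] * n_topics)
    counts = defaultdict(int)
    for (cell, hour, _), mix in zip(tweets, theta):
        key = (cell, hour)
        counts[key] += 1
        for t in range(n_topics):
            sums[key][t] += mix[t]
    return {
        key: [s / counts[key] for s in sums[key]] + [gps_population.get(key, 0)]
        for key in sums
    }


# Hypothetical geo-tagged tweets: (grid cell id, hour of day, tokens).
tweets = [
    ("cell_A", 8, ["train", "delay", "station", "commute"]),
    ("cell_A", 8, ["station", "crowded", "train"]),
    ("cell_B", 19, ["concert", "stadium", "live", "music"]),
    ("cell_B", 19, ["stadium", "music", "crowd"]),
]
# Hypothetical GPS-derived population counts per (cell, hour).
gps_population = {("cell_A", 8): 1200, ("cell_B", 19): 3400}

features = fuse(tweets, gps_population)
for key, vector in sorted(features.items()):
    print(key, vector)
```

Each resulting vector could serve as input to a downstream mobility model. Real tweet corpora would also need tokenization, stop-word removal, and a far larger vocabulary than this toy example uses.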





This work was partially supported by JST, Strategic International Collaborative Research Program (SICORP); Grant-in-Aid for Scientific Research B (17H01784) and Grant-in-Aid for Young Scientists (26730113) of Japan’s Ministry of Education, Culture, Sports, Science, and Technology (MEXT). We especially thank ZENRIN DataCom CO., LTD for providing the GPS data and for their support, and Nightley Inc. for providing the geo-tagged tweets.

Author information



Corresponding author

Correspondence to Satoshi Miyazawa.

Additional information

Satoshi Miyazawa is a PhD student in the Department of Socio-Cultural Environmental Studies at The University of Tokyo, Japan. His research interests include human mobility, LBSN, data mining, and machine learning.

Xuan Song received his BS degree in information engineering from Jilin University, China in 2005 and his PhD degree in signal and information processing from Peking University, China in 2010. From 2010 to 2012, he was a postdoctoral researcher at the Center for Spatial Information Science, The University of Tokyo, Japan. He was promoted to project assistant professor in 2012 and to project associate professor in 2015 at the same university. His research areas are mainly artificial intelligence and data mining.

Tianqi Xia is a master's student in the Department of Socio-Cultural Environmental Studies, The University of Tokyo, Japan. He received his BS degree in geographic information science from Wuhan University, China. His research interests include spatial data mining, data analysis, and intelligent transportation systems.

Ryosuke Shibasaki is a professor at the Center for Spatial Information Science, The University of Tokyo, Japan. His research interests include satellite and airborne remote sensing, tracking technologies, geospatial information gathering and integration among heterogeneous systems, and common service platforms for geospatial information.

Hodaka Kaneda is an employee of ZENRIN-Datacom CO., LTD, Japan. His work involves processing GPS data and supplying the “Konzatsu-Tokei (R)” data.


Cite this article

Miyazawa, S., Song, X., Xia, T. et al. Integrating GPS trajectory and topics from Twitter stream for human mobility estimation. Front. Comput. Sci. 13, 460–470 (2019).

Keywords


  • GPS trajectory
  • human mobility
  • SNS
  • location-based social network (LBSN)
  • topic modeling
  • data mining
  • spatiotemporal topic