Integrating GPS trajectory and topics from Twitter stream for human mobility estimation

  • Satoshi Miyazawa (Email author)
  • Xuan Song
  • Tianqi Xia
  • Ryosuke Shibasaki
  • Hodaka Kaneda
Research Article


Understanding urban dynamics and large-scale human mobility will play a vital role in building smart cities and achieving sustainable urbanization. Existing research in this domain mainly focuses on a single data source (e.g., GPS data or CDR data). In this study, we collect large, heterogeneous data and investigate the relationship between spatiotemporal topics found in geo-tagged tweets and GPS traces from smartphones. We apply Latent Dirichlet Allocation-based topic modeling to geo-tagged tweets to extract and classify topics. The extracted topics and the temporal population distribution derived from GPS traces are then used jointly to model urban dynamics and human crowd flow. The experimental results and validations demonstrate the efficiency of our approach and suggest that fusing cross-domain data for urban dynamics modeling is more practical than previously thought.
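The idea of jointly using tweet topics and GPS-derived population can be illustrated with a minimal sketch. It assumes each tweet already carries a topic label from an LDA model and that both data sources have been mapped to grid cells and hours of the day. The function names, the tuple layout, and the use of Pearson correlation are illustrative assumptions, not the authors' method, and the toy data is invented.

```python
from collections import Counter
from math import sqrt


def hourly_counts(hours_seen, hours=24):
    """Turn a list of hour-of-day values into a fixed-length count vector."""
    counts = Counter(hours_seen)
    return [counts.get(h, 0) for h in range(hours)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def topic_population_correlation(tweets, gps_points, topic, cell):
    """Correlate hourly tweet volume for one topic with hourly GPS headcount in one cell.

    tweets: list of (hour, cell_id, topic_id); topic_id is assumed to come from LDA.
    gps_points: list of (hour, cell_id).
    """
    topic_series = hourly_counts([h for h, c, t in tweets if c == cell and t == topic])
    pop_series = hourly_counts([h for h, c in gps_points if c == cell])
    return pearson(topic_series, pop_series)


# Toy data (invented for illustration): topic 0 in grid cell "A".
tweets = [(8, "A", 0), (8, "A", 0), (9, "A", 0), (18, "A", 0), (12, "A", 1), (8, "B", 0)]
gps_points = [(8, "A"), (8, "A"), (8, "A"), (9, "A"), (9, "A"), (18, "A"), (18, "A"), (8, "B")]
r = topic_population_correlation(tweets, gps_points, topic=0, cell="A")
print(f"correlation for topic 0 in cell A: {r:.3f}")
```

A high correlation in this toy setting only indicates that tweets on a topic rise and fall with the number of people present in the cell; a real analysis would also need to handle sampling bias in both sources.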


GPS trajectory, human mobility, SNS, location-based social network (LBSN), topic modeling, data mining, spatiotemporal topic





This work was partially supported by JST, Strategic International Collaborative Research Program (SICORP); Grant-in-Aid for Scientific Research B (17H01784) and Grant-in-Aid for Young Scientists (26730113) of Japan’s Ministry of Education, Culture, Sports, Science, and Technology (MEXT). We especially thank ZENRIN DataCom Co., Ltd. for providing the GPS data and for their support, and Nightley Inc. for the geo-tagged tweets.

Supplementary material

11704_2017_6464_MOESM1_ESM.pptx (12 MB)
Integrating GPS trajectory and topics from Twitter stream for human mobility estimation



Copyright information

© Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Satoshi Miyazawa (1, Email author)
  • Xuan Song (2)
  • Tianqi Xia (1)
  • Ryosuke Shibasaki (2)
  • Hodaka Kaneda (3)
  1. Department of Socio-Cultural Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
  2. Center for Spatial Information Science, The University of Tokyo, Kashiwa, Japan
  3. Zenrin DataCom Co., Ltd., Tokyo, Japan
