Understanding urban dynamics and large-scale human mobility will play a vital role in building smart cities and achieving sustainable urbanization. Existing research in this domain mainly focuses on a single data source (e.g., GPS data or CDR data). In this study, we collect large-scale heterogeneous data and investigate the relationship between spatiotemporal topics found in geo-tagged tweets and GPS traces from smartphones. We employ Latent Dirichlet Allocation-based topic modeling on geo-tagged tweets to extract and classify the topics. The extracted topics from tweets and the temporal population distribution from GPS traces are then jointly used to model urban dynamics and human crowd flow. The experimental results and validations demonstrate the efficiency of our approach and suggest that the fusion of cross-domain data for urban dynamics modeling is more practical than previously thought.
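As a rough illustration of the topic-extraction step mentioned above, the following minimal sketch fits Latent Dirichlet Allocation to a handful of made-up, already tokenized tweets. It uses collapsed Gibbs sampling, which is a swapped-in inference method chosen for brevity; the abstract does not say which inference algorithm or preprocessing the study used. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy tweets are all hypothetical.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch only).

    docs: list of token lists. Returns (theta, topic_word), where theta[d] is
    the smoothed topic distribution of document d and topic_word[k] maps each
    word to the number of its tokens currently assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Hypothetical tokenized tweets; real input would be geo-tagged and cleaned.
tweets = [
    ["train", "delay", "station", "crowd"],
    ["station", "train", "commute", "morning"],
    ["ramen", "lunch", "shop", "delicious"],
    ["lunch", "cafe", "delicious", "coffee"],
]
theta, topic_word = lda_gibbs(tweets, num_topics=2)
for tokens, dist in zip(tweets, theta):
    print(tokens, [round(p, 2) for p in dist])
```

Here each tweet is treated as one document, which is itself an assumption; short texts are often pooled (for example by place or time window) before topic modeling, and the study may have done something different.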
This work was partially supported by JST, Strategic International Collaborative Research Program (SICORP); Grant-in-Aid for Scientific Research B (17H01784) and Grant-in-Aid for Young Scientists (26730113) of Japan’s Ministry of Education, Culture, Sports, Science, and Technology (MEXT). We especially thank ZENRIN DataCom CO., LTD for the provision of GPS data and their support, and Nightley Inc. for geo-tagged tweets.
Satoshi Miyazawa is a PhD student of the Department of Socio-Cultural Environmental Studies at The University of Tokyo, Japan. His research interests include human mobility, LBSN, data mining, and machine learning.
Xuan Song received his BS degree in information engineering from Jilin University, China in 2005 and his PhD degree in signal and information processing from Peking University, China in 2010. From 2010 to 2012, he was a postdoctoral researcher at the Center for Spatial Information Science, The University of Tokyo, Japan. In 2012 and 2015, he was promoted to project assistant professor and project associate professor, respectively, at the same university. His research areas are mainly in artificial intelligence and data mining.
Tianqi Xia is a master’s student of the Department of Socio-Cultural Environmental Studies, The University of Tokyo, Japan. He received his BS degree in geographic information science from Wuhan University, China. His research interests include spatial data mining, data analysis, and intelligent transportation systems.
Ryosuke Shibasaki is a professor at the Center for Spatial Information Science, The University of Tokyo, Japan. His research interests include satellite and airborne remote sensing, tracking technologies, geospatial information gathering and integration among heterogeneous systems, and common service platforms for geospatial information.
Hodaka Kaneda is an employee of ZENRIN-Datacom CO., LTD, Japan. His work involves processing GPS data and supplying “Konzatsu-Tokei (R)” data.
Cite this article
Miyazawa, S., Song, X., Xia, T. et al. Integrating GPS trajectory and topics from Twitter stream for human mobility estimation. Front. Comput. Sci. 13, 460–470 (2019). https://doi.org/10.1007/s11704-017-6464-3
- GPS trajectory
- human mobility
- location-based social network (LBSN)
- topic modeling
- data mining
- spatiotemporal topic