A multilayer recognition model for twitter user geolocation

Abstract

Geolocation is important for many emerging applications such as disaster management and recommendation systems. In this paper, we propose a multilayer recognition model (MRM) to predict the city-level location of social network users based solely on their tweet content. Through a series of optimizations such as entity selection, spatial clustering and outlier filtering, suitable features are extracted to model the geographic coordinates of Twitter users. A Multinomial Naive Bayes classifier is then applied to classify the datasets into different groups. The model is evaluated against an existing algorithm on Twitter datasets. The experimental results show that our method achieves a higher prediction accuracy, 54.82% on the test set, and reduces the average error to 400.97 miles at best.
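As an illustration only, the classification and evaluation ideas named in the abstract can be sketched in a few lines of Python. This is not the authors' implementation: the entity selection, spatial clustering and outlier filtering steps are omitted, the toy tokens and city labels are invented, and the function names (train_multinomial_nb, predict_city, haversine_miles) are hypothetical. The sketch assumes each user is represented as a bag of tokens and that prediction error is measured as great-circle distance in miles.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log word likelihoods per city."""
    word_counts = defaultdict(Counter)  # city -> word -> count
    doc_counts = Counter(labels)        # city -> number of training users
    vocab = set()
    for tokens, city in zip(docs, labels):
        word_counts[city].update(tokens)
        vocab.update(tokens)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for city, n_city in doc_counts.items():
        # Denominator: all word occurrences in the city plus smoothing mass.
        total = sum(word_counts[city].values()) + alpha * len(vocab)
        model["log_prior"][city] = math.log(n_city / len(labels))
        model["log_lik"][city] = {
            w: math.log((word_counts[city][w] + alpha) / total) for w in vocab
        }
    return model


def predict_city(model, tokens):
    """Return the city with the highest log posterior score.

    Tokens outside the training vocabulary are ignored.
    """
    best_city, best_score = None, -math.inf
    for city, log_prior in model["log_prior"].items():
        log_lik = model["log_lik"][city]
        score = log_prior + sum(log_lik[w] for w in tokens if w in log_lik)
        if score > best_score:
            best_city, best_score = city, score
    return best_city


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    radius_miles = 3958.8  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_miles * math.asin(math.sqrt(a))


# Invented toy data: one token list per user, labelled with a home city.
toy_docs = [
    ["broncos", "denver", "snow"],
    ["subway", "brooklyn", "yankees"],
    ["denver", "rockies"],
    ["yankees", "bronx"],
]
toy_labels = ["Denver", "New York", "Denver", "New York"]
toy_model = train_multinomial_nb(toy_docs, toy_labels)
```

With the toy model, predict_city(toy_model, ["yankees", "subway"]) selects the city whose smoothed word distribution best explains the tokens, and haversine_miles can then score a prediction by the distance between the predicted and true city coordinates. Averaging that distance over test users gives an error figure of the same kind as the one quoted in the abstract.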

Acknowledgements

This work was partially supported by the China National Science and Technology Major Project (2017ZX03001015, 2018ZX03001015, and 2018ZX03001021). It was also supported by the Chinese Academy of Sciences project under Grant No. CXJJ-16M119.

Author information

Corresponding author

Correspondence to Haina Tang.

About this article

Cite this article

Tang, H., Zhao, X. & Ren, Y. A multilayer recognition model for twitter user geolocation. Wireless Netw (2019). https://doi.org/10.1007/s11276-018-01897-1

Keywords

  • Twitter
  • Geolocation
  • Spatial clustering
  • Text classification