Inferring tweet location inference for twitter mining

Spatial Information Research

Abstract

A tweet has many facets: it is created at the speed of thought, propagated in real time, and produces social interchange on an international scale. As a result, users want Twitter mining results presented on a map so that they can search for trending topics or find out what other users are talking about. Because location information is sparse, however, analysis based on position information is difficult. To run Twitter mining on all Korean users, this study used firehose-level data, that is, the full 100% of Twitter data, and applied a new spatial indicator to overcome the sparsity of location information. The study also proposes an algorithm for processing firehose data, along with solutions for overcoming the study’s limitations. The conventional approach, which uses spritzer-level data and a supervised method, inferred 44 times more tweet positions than the method using geotags, whereas the method used in this study saw inferences rise 680-fold. Among the clustering algorithms, K-Center Clustering inferred the largest number of user residential locations. The ultimate goal of the study is for Twitter data, including the massive volume of location information inferred and created in real time, to serve as a means of city monitoring once the study’s limitation has been overcome, namely the automated refining of unnecessary words in profile location information and Twitter mining.
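The abstract names K-Center Clustering but does not describe its implementation. As a rough illustration only, the sketch below uses the standard greedy farthest-first heuristic for the k-center problem on (latitude, longitude) pairs and takes the center of the largest cluster as a residence estimate. The function names, the haversine distance, the sample coordinates, and the "largest cluster" rule are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def k_center(points, k):
    """Greedy farthest-first k-center: returns (centers, label per point)."""
    centers = [points[0]]  # assumption: seed with the first point
    dist = [haversine_km(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # The next center is the point farthest from all current centers.
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, haversine_km(p, points[i])) for d, p in zip(dist, points)]
    labels = [min(range(len(centers)), key=lambda c: haversine_km(p, centers[c]))
              for p in points]
    return centers, labels

def residence_estimate(points, k):
    """Hypothetical rule: the center of the most populated cluster."""
    centers, labels = k_center(points, k)
    return centers[Counter(labels).most_common(1)[0][0]]

# Made-up positions for one user: three near Seoul, one in Busan.
user_points = [(37.5665, 126.9780), (37.5700, 126.9800),
               (37.5500, 126.9900), (35.1796, 129.0756)]
print(residence_estimate(user_points, k=2))
```

With these sample points the Busan position becomes the second center because it is the farthest from the seed, and the three Seoul positions share the first cluster.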

Notes

  1. Data that streams in real time or changes into a dynamic state.

  2. Searching for topics prevalent in tweets, or finding out what is being talked about among people [2].

  3. An indicator that shows a location value on Twitter [6].

  4. JSON (JavaScript Object Notation) is a data-interchange format. It is easy for people to read and write, and easy for machines to parse and generate.
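As a small illustration of notes 3 and 4, the sketch below parses a tweet-shaped JSON string and pulls out two location indicators: the geotag and the free-text profile location. The field names `coordinates` and `user.location` follow the classic Twitter REST/Streaming tweet object; the sample payload and the helper name `location_indicators` are hypothetical and are not taken from the article.

```python
import json

# Hypothetical tweet payload; real tweets carry many more fields.
SAMPLE_TWEET = """
{
  "text": "hello",
  "coordinates": {"type": "Point", "coordinates": [126.978, 37.5665]},
  "user": {"location": "Seoul, Korea"}
}
"""

def location_indicators(raw_tweet):
    """Return the geotag (lon, lat) and profile location, or None when absent."""
    tweet = json.loads(raw_tweet)
    geo = tweet.get("coordinates")  # GeoJSON point, ordered [longitude, latitude]
    lon_lat = tuple(geo["coordinates"]) if geo else None
    profile = (tweet.get("user") or {}).get("location") or None
    return {"geotag": lon_lat, "profile_location": profile}

print(location_indicators(SAMPLE_TWEET))
```

Most tweets have no geotag, so the profile location is often the only indicator available; it is free text and usually needs cleaning before it can be geocoded.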

References

  1. Lee, B. Y., Lim, J. T., & Yoo, J. (2013). Utilization of social media analysis using big data. The Journal of the Korea Contents Association, 13(2), 211–219.


  2. Russell, M. A. (2013). Mining the social web: data mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More (2nd ed.). Sebastopol: O’Reilly Media.


  3. Guo, J., Zhang, P., & Guo, L. (2012). Mining hot topics from Twitter streams. Procedia Computer Science, 9, 2008–2011.


  4. Bifet, A. (2013). Mining big data in real time. Informatica, 37, 15–20.


  5. Kim, M. G., & Koh, J. H. (2016). Recent research trends for geospatial information explored by Twitter data. Spatial Information Research, 24(2), 65–73. doi:10.1007/s41324-016-0007-0.


  6. Ajao, O., Hong, J., & Liu, W. (2015). A survey of location inference techniques on Twitter. Journal of Information Science, 41(6), 855–864.


  7. Blanford, J., Huang, Z., Savelyev, A., & MacEachren, A. M. (2015). Geo-located tweets. Enhancing mobility maps and capturing cross-border movement. PLoS One, 10(6), e0129202.


  8. Dredze, M., Paul, M. J., Bergsma, S., & Tran, H. (2013). Carmen: A Twitter geolocation system with applications to public health. In AAAI workshop on expanding the boundaries of health informatics using AI (HIAI) (pp. 20–24).

  9. Nelson, J. K., Quinn, S., Swedberg, B., Chu, W., & MacEachren, A. M. (2015). Geovisual analytics approach to exploring public political discourse on Twitter. ISPRS International Journal of Geo-Information, 4(1), 337–366.


  10. Tweetping Website. https://www.tweetping.net. Accessed 1 April 2016.

  11. LIVE Singapore Website. http://senseable.mit.edu/livesingapore/index.html. Accessed 1 April 2016.

  12. SK Telecom Smart Insight Webpage. http://www.smartinsight.co.kr. Accessed 1 April 2016.

  13. Morstatter, F., Pfeffer, J., Liu, H., & Carley, K. M. (2013). Is the sample good enough? Comparing data from Twitter’s Streaming API with Twitter’s Firehose. In Proceedings of ICWSM.

  14. Luo, F., Cao, G., Mulligan, K., & Li, X. (2015). Explore spatiotemporal and demographic characteristics of human mobility via twitter: A case study of Chicago. arXiv preprint arXiv:1508.00188.

  15. Frias-Martinez, V., Sae-Tang, A., & Frias-Martinez, E. (2014). To call, or to tweet? Understanding 3-1-1 citizen complaint behaviors. In SocialCom 2014: The sixth IEEE/ASE international conference on social computing. http://galaxy.cs.lamar.edu/~kmakki/2014-ASE/2014%20ASE%20Conference%20Stanford%20University%20Proceedings/Proceedings.pdf

  16. Zhang, J., Sun, J., Zhang, R., & Zhang, Y. (2015). Your actions tell where you are: Uncovering Twitter users in a metropolitan area. In IEEE Conference on Communications and Network Security (CNS), 2015 (pp. 424–432).

  17. Yim, J. Y., Ha, H. S., & Hwang, B. Y. (2015). A method for detecting event location based on similar keyword extraction in tweet text. Journal of Korea Spatial Information Society, 23(5), 1–7.


  18. Gonzalez, R., Figueroa, G., & Chen, Y. S. (2012). Tweolocator: a non-intrusive geographical locator system for twitter. In Proceedings of the 5th ACM SIGSPATIAL international workshop on location-based social networks (pp. 24–31).

  19. Kotzias, D., Lappas, T., & Gunopulos, D. (2014). Addressing the Sparsity of Location Information on Twitter. In EDBT/ICDT Workshops (pp. 339–346).

  20. Valkanas, G., & Gunopulos, D. (2012). Location extraction from social networks with commodity software and online data. In IEEE 12th international conference on data mining workshops (ICDMW), 2012 (pp. 827–834).

  21. Lim, H. J., & Park, S. H. (2015). A tentative approach for regional futures strategy with big data. The Korean Cadastre Information Association, 17(1), 75–90.


  22. Park, W. J., & Yu, K. Y. (2015). Spatial clustering analysis based on text mining of location based social media data. Journal of the Korean Society for Geospatial Information Science, 23(2), 89–96.


  23. Kang, A. T., & Kang, Y. O. (2015). Location inference of Twitter users using timeline data. Journal of Korea Spatial Information Society, 23(2), 69–81.


  24. Han, S. G. (2014). Social media. Melbourne: Acorn Publication.


  25. Li, R., Wang, S., Deng, H., Wang, R., & Chang, K. C. (2012). Towards social user profiling: Unified and discriminative influence model for inferring home locations. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1023–1031).

  26. Abdelhaq, H., Gertz, M., & Sengstock, C. (2013). Spatio-temporal characteristics of bursty words in Twitter streams. In Proceedings of the 21st ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 194–203).

  27. 120 Dasan Seoul Call Center Webpage. http://120dasan.seoul.go.kr/foreign/english.html. Accessed 1 April 2016.

  28. Seoul Smart Report Application. https://play.google.com/store/apps/details?id=kr.go.seoul.seoulSmartReport&hl=ko. Accessed 1 April 2016.

  29. K-Center Clustering. http://trendsofcode.net

  30. DBSCAN. http://slideplayer.com/slide/4239151


Author information

Corresponding author

Correspondence to June Hwan Koh.

Additional information

This article is a condensed form of the first author’s Ph.D. thesis from University of Seoul.

About this article

Cite this article

Kim, M.G., Kang, Y.O., Lee, J.Y. et al. Inferring tweet location inference for twitter mining. Spat. Inf. Res. 24, 421–435 (2016). https://doi.org/10.1007/s41324-016-0041-y

  • DOI: https://doi.org/10.1007/s41324-016-0041-y
