Inferring Urban Land Use Using Large-Scale Social Media Check-in Data

Networks and Spatial Economics


Emerging location-based services in social media tools such as Foursquare and Twitter are providing an unprecedented amount of publicly generated data on human movements and activities. This novel data source contains valuable information on human activities (e.g., geo-location, time and date, type of place). Beyond its value for modeling human activity patterns, the data is also useful for inferring planning-related variables such as a city’s land use characteristics. This paper investigates the feasibility and validity of using large-scale social media check-in data to infer land use types by applying state-of-the-art data mining techniques. Two inference approaches are proposed and tested: an unsupervised clustering method and a supervised learning method. The land use inference is conducted on a uniform grid of 200 m by 200 m cells. The methods are applied to a case study of New York City. The validation results confirm that both approaches effectively infer different land use types given sufficient check-in data. These encouraging results demonstrate the potential of using social media check-in data in urban land use inference, and also reveal the hidden linkage between human activity patterns and the underlying urban land use pattern.
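As a rough illustration of the grid-level setup described in the abstract, the sketch below bins check-ins into 200 m by 200 m cells and summarizes each cell by its mix of place types. It is not the authors' implementation: the feature construction used in the paper is not shown here, and the projected metre coordinates, the category names, and the function names `cell_index`, `cell_features`, and `category_shares` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

CELL_SIZE_M = 200.0  # grid resolution stated in the abstract


def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map projected coordinates (metres) to an integer (col, row) grid cell."""
    return (int(x_m // cell_size), int(y_m // cell_size))


def cell_features(checkins, cell_size=CELL_SIZE_M):
    """Count check-ins per place category within each grid cell."""
    counts = defaultdict(Counter)
    for x_m, y_m, category in checkins:
        counts[cell_index(x_m, y_m, cell_size)][category] += 1
    return counts


def category_shares(counter):
    """Turn raw category counts for one cell into proportions."""
    total = sum(counter.values())
    return {category: n / total for category, n in counter.items()}


# Hypothetical check-ins: (x in metres, y in metres, place category).
checkins = [
    (50.0, 120.0, "food"),
    (180.0, 10.0, "office"),
    (199.9, 199.9, "food"),
    (200.0, 0.0, "residence"),
    (450.0, 610.0, "shop"),
]

features = cell_features(checkins)
for cell, counter in sorted(features.items()):
    print(cell, category_shares(counter))
```

Per-cell vectors of this kind could then be passed to a clustering routine or to a classifier trained on known land use labels, which are the two families of methods the abstract names.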






Author information


Corresponding author

Correspondence to Satish V. Ukkusuri.


Cite this article

Zhan, X., Ukkusuri, S.V. & Zhu, F. Inferring Urban Land Use Using Large-Scale Social Media Check-in Data. Netw Spat Econ 14, 647–667 (2014).
