Clustering daily patterns of human activities in the city

Data Mining and Knowledge Discovery

Abstract

Data mining and statistical learning techniques are powerful analysis tools that have yet to be incorporated into urban studies and transportation research. In this work, we analyze an activity-based travel survey conducted in the Chicago metropolitan area over a demographically representative sample of its population. Detailed data on activities by time of day were collected from more than 30,000 individuals (and 10,552 households) who participated in a 1-day or 2-day survey conducted from January 2007 to February 2008. We examine these large-scale data to explore three critical issues: (1) the inherent daily activity structure of individuals in a metropolitan area, (2) the variation of individual daily activities, that is, how they grow and fade over time, and (3) clusters of individual behaviors and the socio-demographic information associated with them. We find that the population can be clustered into 8 representative groups according to weekday activities and 7 according to weekend activities. Our results enrich the traditional division into only three groups (workers, students and non-workers) by providing clusters based on activities at different times of day. The generated clusters, combined with socio-demographic information, provide a new perspective for urban and transportation planning as well as for emergency response and spreading dynamics, by addressing when, where, and how individuals interact with places in metropolitan areas.

Author information

Corresponding author

Correspondence to Marta C. González.

Additional information

Responsible editor: Fei Wang, Hanghang Tong, Phillip Yu, Charu Aggarwal.

About this article

Cite this article

Jiang, S., Ferreira, J. & González, M.C. Clustering daily patterns of human activities in the city. Data Min Knowl Disc 25, 478–510 (2012). https://doi.org/10.1007/s10618-012-0264-z

  • DOI: https://doi.org/10.1007/s10618-012-0264-z
