Urban-semantic computer vision: a framework for contextual understanding of people in urban spaces

  • Main Paper
  • Published in: AI & SOCIETY

Abstract

Increasing computational power and improving deep learning methods have made computer vision technologies pervasive in urban environments. Their applications in policing, traffic management, and documenting public spaces are increasingly common (Ridgeway 2018; Coifman et al. 1998; Sun et al. 2020). Despite the often-discussed biases in the algorithms' training and their unequally borne benefits (Khosla et al. 2012), almost all applications similarly reduce urban experiences to simplistic, reductive, and mechanistic measures. These practices lack the context, depth, and specificity that would enable semantic knowledge or analysis within urban contexts, especially regarding how people use and occupy urban space. This paper critiques existing uses of artificial intelligence and computer vision in urban practices in order to propose a new framework for understanding people, action, and public space. It revisits Geertz's (1973) use of thick descriptions in generating interpretive theories of culture and activity, and uses this lens to establish a framework for evaluating the varied uses of computer vision technologies that weigh meaning. By discussing implemented examples of urban computer vision, from LinkNYC and Numina's urban measurements to the Detroit Police's use of DataWorks Plus's facial recognition technology, it proposes a framework for evaluating the thickness of an algorithm's conclusions against the complexity of the computational method required to produce that outcome. Further, we discuss how the framework's positioning may differ (and conflict) between different users of the technology, from engineer to urban planner and policymaker to citizen.
This paper also discusses how the current use and training of deep learning algorithms limit semantic learning, and proposes three potential methodologies for gaining a more contextually specific, urban-semantic description of urban space relevant to urbanists. It contributes to the critical conversations regarding the proliferation of artificial intelligence by challenging current applications of these technologies in the urban environment and highlighting their failures within this context, while also proposing an evolution of these algorithms that may ultimately make them sensitive and useful within this spatial and cultural milieu.

Data availability

No datasets were generated or analyzed during the current study.

Notes

  1. “Imageable” here is taken to mean a cognitive, memory-based image, as used in the earlier reference to Lynch’s work.

  2. These together form the commonly used “four V’s” of big data: velocity, veracity, volume and variety.

  3. While this paper will not comprehensively review the technology and the depth of its applications, other papers have sought to categorize various approaches. See, for instance, Ibrahim et al. 2020.

Author information

Corresponding author

Correspondence to Anthony Vanky.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Vanky, A., Le, R. Urban-semantic computer vision: a framework for contextual understanding of people in urban spaces. AI & Soc 38, 1193–1207 (2023). https://doi.org/10.1007/s00146-022-01625-6

  • DOI: https://doi.org/10.1007/s00146-022-01625-6
