Towards human-AI collaborative urban science research enabled by pre-trained large language models

Fu, Jiayi; Han, Haoying; Su, Xing; Fan, Chao

doi:10.1007/s44212-024-00042-y

Perspective
Open access
Published: 29 April 2024

Volume 3, article number 8, (2024)

Urban Informatics

Abstract

Pre-trained large language models (PLMs) have the potential to support urban science research through content creation, information extraction, assisted programming, text classification, and other technical advances. In this research, we explored the opportunities, challenges, and prospects of PLMs in urban science research. Specifically, we discussed potential applications of PLMs to urban institution, urban space, urban information, and citizen behaviors research through seven examples using ChatGPT. We also examined the challenges of PLMs in urban science research from both technical and social perspectives. The prospects of the application of PLMs in urban science research were then proposed. We found that PLMs can effectively aid in understanding complex concepts in urban science, facilitate urban spatial form identification, assist in disaster monitoring, sense public sentiment and so on. They have expanded the breadth of urban research in terms of content, increased the depth and efficiency of the application of multi-source big data in urban research, and enhanced the interaction between urban research and other disciplines. At the same time, however, the applications of PLMs in urban science research face evident threats, such as technical limitations, security, privacy, and social bias. The development of fundamental models based on domain knowledge and human-AI collaboration may help improve PLMs to support urban science research in future.

1 Introduction

As the most intricate creation of humankind, cities are convoluted systems composed of multiple dimensions and factors. Consequently, urban research has evolved into a complex and significant social undertaking (Emmi, 2008; Marshall, 2012). Furthermore, the technological revolution, the proliferation of big data in cities, and the dissemination of artificial intelligence have not only transformed cities but have also altered the manner in which urban researchers investigate them (C. Wang & Yin, 2023). Technologies such as Machine Learning (ML), Deep Learning (DL), and their applications in Natural Language Processing (NLP) and Computer Vision (CV) have gained extensive usage in urban science research (Cai, 2021; Casali et al., 2022; J. Wang & Biljecki, 2022). These emerging technologies present an opportunity for traditional urban research methodologies and propel urban research in a quantitative, computational, and intelligent direction. However, despite their promising potential, several obstacles hinder their application, such as low robustness (Goodfellow et al., 2014), algorithmic and technical constraints (Cai, 2021), and insufficient semantic comprehension (Bender & Koller, 2020). Whether these issues can be resolved via novel technologies or tools is a topic worthy of examination in current urban science research.

Pre-trained large language models (PLMs), such as ChatGPT (OpenAI, 2022), have the potential to play a pivotal role in tackling these challenges. PLMs are a new paradigm of NLP (Li, 2021): they are pre-trained on large-scale text corpora using self-supervised learning, which simplifies various complex natural language processing problems into straightforward fine-tuning problems (Qiu et al., 2020). At present, PLMs include monolingual models such as BERT (Devlin et al., 2019) and GPT (OpenAI, 2023), and multilingual models such as mBERT (Devlin et al., 2019) and XLM-R (Conneau et al., 2020). One notable model among them is ChatGPT, a large language model pre-trained with autoregressive language modeling (Else, 2023; J. Yang et al., 2023). By integrating various cutting-edge techniques such as unsupervised learning and instruction fine-tuning (Wu et al., 2023), ChatGPT boasts formidable content generation capabilities through its artificial intelligence generated content (AIGC) technology. This capability enables PLMs to independently learn from data and produce sophisticated and seemingly intelligent outcomes (van Dis et al., 2023). Owing to their exceptional proficiency in text learning, text classification, information extraction, and text generation (Owens, 2023; F. Wang et al., 2023; F. Wang, Yang, et al., 2023), PLMs have demonstrated immense potential in diverse fields including finance (Dowling & Lucey, 2023), medicine (Biswas, 2023b; Jungwirth & Haluza, 2023; Verhoeven et al., 2023), education (Kooli, 2023; H. Yang, 2023), and environment (An et al., 2023; Biswas, 2023a; Zhu et al., 2023). PLMs are expected to play a crucial role in advancing urban research through various means, such as simplifying the interpretation of complex urban concepts, automating repetitive programming tasks in urban data analysis, and improving the utilization of multi-disciplinary knowledge for urban science research (see Fig. 1).

To make our perspective more comprehensive in covering the application of PLMs in urban study, we refer to the sustainable urban systems model proposed by Meerow and Newell (2019). Urban study usually includes four subsystems: economic, social, physical, and environmental. A sustainable urban system covers governance and planning, innovation and competitiveness, lifestyle and consumption, resource management and climate mitigation and adaptation, transport and accessibility, buildings, spatial environment and public space, and other dimensions (McCormick et al., 2013; Webb et al., 2018). Based on this, we refine the urban study themes according to the features and functions of PLMs, focusing on the application of PLMs in four dimensions: urban decision-making and governance, urban space, urban information, and citizen behaviors. We illustrate useful applications of PLMs in these four aspects and examine potential issues and challenges facing PLMs in urban research from both technical and social perspectives. Finally, we discuss possible directions for PLMs in future urban research.

2 Opportunities

2.1 Urban institution

The domain of urban institutional research comprises a wide range of topics, including but not limited to institutional design, public policy development, public comprehension of policy, and public policy response (Farazmand, 2023). This involves handling a significant amount of text-based data. Traditional institutional text analysis usually adopts social network analysis, text semantic analysis, and other methods, which require researchers to input and code the research text in specialized software, making the procedure cumbersome. As AI models, PLMs offer superior intelligent question answering, text classification, and text generation capabilities. They extend the level of interaction between the researcher and the research document, making that interaction more efficient and convenient. PLMs can comprehend the researcher's queries and respond with accurate and lucid language in both restricted-domain and open-domain Q&A (Wu et al., 2023). PLMs can also extract crucial information from a city system document to provide a summary of its main content (Min et al., 2021). Moreover, text classification is a distinct advantage of PLMs. PLMs can distinguish positive from negative sentiment in texts, which helps researchers obtain prompt public feedback on urban institutions and discern the public's key requirements for institutions or policies (Karduni & Sauda, 2020), and enables policymakers to comprehend the underlying reasons for public endorsement of or opposition to urban institutions (Luo et al., 2019). This use of public opinion helps to advance the construction of urban institutions. We demonstrate the potential of PLMs in urban institution research through the following examples.

PLMs assist urban researchers in tasks such as information retrieval, summarization, and tracking of urban institutions and documents. As shown in Table 1, we used ChatGPT to obtain five institutional documents concerning urban land use. ChatGPT effectively retrieved and cited the relevant documents with the assistance of the WebGPT plug-in.

Table 1 ChatGPT for searching documents (generated on April 22, 2023)

In addition, PLMs possess remarkable capabilities for aggregation, allowing pertinent information and key points to be extracted automatically from extensive materials. As an illustration, we provided ChatPDF (an open tool based on the ChatGPT API) with Technology and the Future of Cities, a report by the President's Council of Advisors on Science and Technology (PCAST). We asked ChatPDF to extract and summarize the primary points of the document and to answer specific questions on particular topics. The tool successfully sorted and condensed the content of the report as requested. It also located the relevant content in the institutional document and indicated the source of each answer (see Table 2).

Table 2 ChatGPT for summarizing documents (generated on April 22, 2023)

PLMs are capable of acquiring and elucidating complex urban concepts, which can be particularly useful for researchers without a background in urban research. They can explain relevant terminology without requiring additional context. For instance, we tasked ChatGPT with explaining the meanings of various concepts that we had identified as relevant to urban research, such as "Spatial Planning", "Metropolitan Area", "Smart City", and "Carbon Neutral". As expected, ChatGPT was able to provide precise and accurate explanations of these concepts (see Table 3).

Table 3 ChatGPT's feedback for explaining concepts (generated on May 2, 2023)

2.2 Urban space

The study of urban space covers multiple dimensions such as geographic location, spatial form, spatial structure, land use, architectural form, and urban landscape (Koumetio Tekouabou et al., 2023; Sharifi et al., 2023). These dimensions involve diverse textual and non-textual data sources. Traditional urban spatial research often takes images or figures as the object of study and is often limited in practice by processing technology and other factors. The advent of PLMs presents novel approaches for integrating multi-source data in urban space study. Due to its advanced natural language processing capabilities, ChatGPT is able to carry out tasks such as code generation and modification (Merow et al., 2023; Sobania et al., 2023). This enhances the efficiency of urban spatial research by assisting in programming and streamlining the integration of novel data sources, such as cell phone signaling and points of interest (POI), into urban spatial research. The multimodal understanding of PLMs also facilitates the use of images such as city streetscapes, expanding the depth and breadth of urban research data. Two examples are used to demonstrate the opportunities of PLMs in urban space study.

We used POI data to delineate central city boundaries, as an illustrative example. In this process, PLMs can assist with remote sensing imagery analysis, kernel density analysis, and other methods. For instance, we could query ChatGPT for guidance on "batch processing image cropping using ArcGIS, along with Python code and explanations," and employ ArcPy to execute the command (see Table 4). PLMs can also assist with POI data crawling. We can make a request to ChatGPT: "How can we crawl POI data with permission through the AMap (http://lbs.amap.com) API?" ChatGPT can then provide the relevant code for POI data crawling (see Table 4). PLMs have the ability to assist with programming, which makes them useful in urban streetscape recognition. One application of PLMs is to use models such as convolutional neural networks (CNNs) for urban landscape recognition. For instance, we utilized ChatGPT to help us construct a CNN model for the identification of street trees in urban streetscapes, using Python code (see Table 4).

Table 4 ChatGPT output code for ArcGIS operations, POI acquisition, street view recognition (generated on April 17, 2023)
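
The kernel density step mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the ArcPy code shown in Table 4: the POI coordinates are made up, they are assumed to be in a projected coordinate system measured in metres, and the bandwidth and the "half of peak density" threshold used to mark a central core are arbitrary illustrative choices.

```python
import math


def gaussian_kde_grid(points, bandwidth, grid_size, bounds):
    """Evaluate a 2D Gaussian kernel density on a regular grid.

    points: list of (x, y) POI coordinates (projected CRS, metres assumed).
    bounds: (xmin, ymin, xmax, ymax) of the study area.
    Returns grid_size rows of grid_size densities (row 0 lies at ymin).
    """
    xmin, ymin, xmax, ymax = bounds
    dx = (xmax - xmin) / (grid_size - 1)
    dy = (ymax - ymin) / (grid_size - 1)
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for row in range(grid_size):
        gy = ymin + row * dy
        line = []
        for col in range(grid_size):
            gx = xmin + col * dx
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            line.append(total * norm)
        grid.append(line)
    return grid


def core_mask(grid, fraction_of_peak=0.5):
    """Mark cells whose density reaches a chosen fraction of the peak density."""
    peak = max(max(row) for row in grid)
    return [[cell >= fraction_of_peak * peak for cell in row] for row in grid]


# Hypothetical POIs: a dense lattice around (5000, 5000) plus four outliers.
pois = [(5000 + 100 * i, 5000 + 100 * j) for i in range(-5, 6) for j in range(-5, 6)]
pois += [(1000, 1000), (9000, 1000), (1000, 9000), (9000, 9000)]

density = gaussian_kde_grid(pois, bandwidth=800.0, grid_size=21, bounds=(0, 0, 10000, 10000))
mask = core_mask(density)
for row in reversed(mask):  # print north at the top
    print("".join("#" if cell else "." for cell in row))
```

In practice the threshold that separates the central city from its surroundings is a research decision; a fixed fraction of the peak is used here only to keep the sketch short.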

2.3 Urban information

Urban information refers to information generated by a multitude of data sources such as information and communication technologies (ICT), remote and physical sensors, and individuals (C. Wang & Yin, 2023), and encompasses a wide range of topics including urban traffic, logistics, environment, disasters, and various types of urban economic information (Ismagilova et al., 2019). Traditional urban information research has likewise been limited to a single mode of access and processing, and further constrained by the technical skills of the researcher. PLMs have helped to advance urban information research by expanding the mode of access from traditional single-sensor access to a wide range of web and social media sources. They also expand the processing of urban information: beyond identifying geographic information in text, they support the monitoring and prediction of disasters, housing prices, and traffic flow by assisting with code writing and by applying natural language processing, text mining, and machine learning. The following two examples illustrate the potential of PLMs in urban information research.

PLMs can help monitor and predict natural disasters or public health events. Firstly, text mining, an important function of PLMs, can identify and extract disaster-related information from diverse sources, such as news articles, social media, and emergency reports. This information includes the time, location, and magnitude of the disaster. Secondly, the natural language reasoning capabilities of PLMs can aid in solving various comprehension and reasoning tasks, including scenario estimation for disaster monitoring and generating corresponding monitoring reports (Zheng et al., 2023). Additionally, time series analysis of disaster texts aids in achieving disaster prediction. As shown in Table 5, we supplied ChatGPT with a text describing a disaster (extracted from a web report of the 2022 floods in Assam, India), and requested it to identify the time and location of the disaster and provide location details.

Table 5 ChatGPT aids in disaster monitoring (generated on May 2, 2023)
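
One way to make this kind of extraction reusable is to ask the model for a fixed set of fields and parse the reply programmatically. The sketch below is an assumption-laden illustration rather than the procedure used for Table 5: the prompt wording, the field names, and the sample reply are all hypothetical, and the call to an actual PLM is deliberately left out so that the example runs offline.

```python
import json

# Fields taken from the text above: time, location, and magnitude of the disaster.
FIELDS = ("time", "location", "magnitude")


def build_extraction_prompt(report_text):
    """Build a prompt that asks for the disaster details as a JSON object."""
    return (
        "Extract the disaster details from the report below. "
        "Reply with a JSON object with exactly these keys: "
        + ", ".join(FIELDS)
        + ". Use null for anything the report does not state.\n\n"
        + "Report:\n"
        + report_text
    )


def parse_extraction(reply_text):
    """Parse the model reply, tolerating extra text around the JSON object."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply_text[start:end + 1])
    # Keep only the expected fields; missing ones become None.
    return {key: data.get(key) for key in FIELDS}


# Invented report and reply, used only to exercise the parser.
report = "Heavy rain on 3 May flooded low-lying parts of Riverside District."
prompt = build_extraction_prompt(report)
sample_reply = 'Result: {"time": "3 May", "location": "Riverside District"}'
print(parse_extraction(sample_reply))
```

Requesting null for unstated fields makes gaps explicit, which matters because extracted values still need to be checked against the source report.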

PLMs can assist in forecasting urban information, including housing prices, by drawing on various data sources such as demographic data, real estate listings, and local economic indicators. Moreover, with the aid of assisted programming, we can perform data analysis to forecast future house prices in a particular area. As an illustration, we could request ChatGPT to construct a random forest model to predict the future trend of housing prices and provide us with the code for this prediction (see Table 6).

Table 6 ChatGPT aids in predicting house prices (generated on May 2, 2023)
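
For readers unfamiliar with the technique, a random forest regression of this kind might look roughly as follows with scikit-learn. This is a sketch on synthetic data, not the code returned by ChatGPT in Table 6: the three features, their coefficients, and the noise level are invented for illustration, and a real study would replace them with observed listings and indicators.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic, made-up housing data (NOT real prices): three hypothetical features.
rng = np.random.default_rng(0)
n = 400
floor_area = rng.uniform(50, 150, n)     # square metres
dist_center = rng.uniform(0, 20, n)      # kilometres to the city centre
building_age = rng.uniform(0, 40, n)     # years
noise = rng.normal(0, 2000, n)
price = 40000 + 300 * floor_area - 1500 * dist_center - 200 * building_age + noise

X = np.column_stack([floor_area, dist_center, building_age])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)

# Fit the forest and evaluate it on data it has not seen.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2 = r2_score(y_test, model.predict(X_test))
importances = dict(
    zip(["floor_area", "dist_center", "building_age"], model.feature_importances_)
)
print(f"held-out R^2: {r2:.2f}")
print(importances)
```

Holding out a test set is the important design choice here: code suggested by a PLM should be judged on unseen data before its forecasts are trusted.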

2.4 Citizen behaviors

Research on citizen behaviors in cities covers issues such as public sentiment, population mobility, travel behavior, poverty and crime (Sharifi et al., 2023). PLMs can help to study these issues, especially as a powerful tool to complement complex human-centered tasks. With their remarkable language processing capabilities, PLMs can parse social media texts, discern public events, track population movements, and monitor criminal activity, among other tasks. PLMs also possess powerful capabilities in sentiment analysis (Abdul-Rahman et al., 2021) and gesture monitoring (Zhang et al., 2022). They can analyze the sentiment of posts, online comments, and various types of news or stories, and categorize them as either positive or negative, pros or cons. In addition, PLMs can analyze sentiment trends over time and detect significant changes in public opinion (F. Wang, Yang, et al., 2023). This capability facilitates a comprehensive analysis of the shift in public sentiment towards an event or a location and encourages the utilization of social media data in urban research (Abdul-Rahman et al., 2021). As an illustration, we presented ChatGPT with a set of paragraphs describing the utilization of the OpenAI API and its Tweet classifier to perform sentiment analysis on a comment regarding a certain park. ChatGPT was able to accurately identify the sentiment tendencies present in the comment (see Table 7).

Table 7 OpenAI API for sentiment analysis (generated on April 23, 2023)
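
The trend analysis described above can be illustrated once each comment carries a sentiment label. The following sketch assumes the labels have already been produced by some classifier (for example a PLM); the period keys, the labels, and the 0.3 change threshold are hypothetical choices made for the example, not values from this study.

```python
from collections import defaultdict


def positive_share_by_period(labelled_comments):
    """Share of positive comments per period.

    labelled_comments: iterable of (period, label) pairs, where label is
    'positive' or 'negative' and period is a sortable key such as '2023-01'.
    """
    counts = defaultdict(lambda: [0, 0])  # period -> [positive, total]
    for period, label in labelled_comments:
        counts[period][1] += 1
        if label == "positive":
            counts[period][0] += 1
    return {p: pos / total for p, (pos, total) in sorted(counts.items())}


def flag_shifts(shares, min_change=0.3):
    """List consecutive periods whose positive share changed by min_change or more."""
    periods = list(shares)
    return [
        (prev, cur, round(shares[cur] - shares[prev], 3))
        for prev, cur in zip(periods, periods[1:])
        if abs(shares[cur] - shares[prev]) >= min_change
    ]


# Invented labels for comments about a park, grouped by month.
comments = (
    [("2023-01", "positive")] * 4 + [("2023-01", "negative")] * 1
    + [("2023-02", "positive")] * 3 + [("2023-02", "negative")] * 2
    + [("2023-03", "positive")] * 1 + [("2023-03", "negative")] * 4
)
shares = positive_share_by_period(comments)
print(shares)
print(flag_shifts(shares))
```

With only a handful of comments per period such shares are noisy, so a real analysis would also need a minimum sample size per period.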

We summarize the possible applications of PLMs to urban institution, urban space, urban information, and citizen behaviors (see Table 8).

Table 8 Summary of PLMs applications in urban research

3 Challenges

3.1 Technical perspective

3.1.1 Technical restrictions

Time restrictions: PLMs require vast amounts of data to train the initial models. For instance, ChatGPT's training data only goes up to June 2021 (Zhu et al., 2023). This means that ChatGPT can only understand and infer information from 2021 and earlier, making it challenging to keep up with the most current data (Teubner et al., 2023). Consequently, when asked to provide ten authoritative papers on urban research, ChatGPT was unable to provide current research papers because of its training data cutoff (see Table 9).

Table 9 ChatGPT's feedback for providing papers and collecting data (generated on April 22, 2023)

Permission restrictions: The issue of data restrictions in PLMs is further exacerbated by the incompleteness and inaccessibility of big data (Salganik, 2018). Although ChatGPT is capable of searching the web and annotating citation sources when using the WebGPT plugin, there are still significant limitations in data collection, such as inaccessible cell phone signaling and travel data. This hinders researchers from using PLMs to obtain authoritative information for urban studies. As shown in Table 9, when trying to study urban demographic characteristics, we asked ChatGPT about the current demographic characteristics of each province in China. However, ChatGPT indicated that the information was unavailable due to training data limitations and access restrictions. This indicates that researchers still need to manually retrieve data from specialized databases instead of relying solely on PLMs when conducting research on recent data.

Modality restrictions: Presently, the multi-modality of PLMs is mainly exhibited in the inference and analysis of data and text, as well as basic image processing. For other modes such as audio and video, plug-ins and auxiliary programming are often required (J. Yang et al., 2023). It is also challenging for PLMs to directly recognize remote sensing images in urban research, and difficult to apply them to urban soundscape and urban image studies.

Technical restrictions: Although PLMs, as exemplified by ChatGPT, have an automatic question-and-answer function that can help solve problems step by step, their benefits differ for researchers with different skill levels. PLMs are not friendly enough for beginners in urban research. For example, when facing an urban research question, a beginner may not know how to write an appropriate prompt and may find it difficult to judge whether ChatGPT's answer is accurate.

3.1.2 Authenticity and validity

On the one hand, PLMs may generate false information, including fabricated literature (Jungwirth & Haluza, 2023) and factual errors (hallucinations) (Wu et al., 2023), particularly in low-resource settings. On the other hand, the performance of PLMs is not uniformly consistent and stable, which may result in disparate responses to the same query (Liu et al., 2023). For instance, when we inquired about "information on the fourth census of China", ChatGPT provided wholly inconsistent data, which could lead to entirely erroneous conclusions in urban studies (see Table 10). It can be observed that ChatGPT does not currently offer a precise and dependable source of information for urban research, nor does it have the capability to effectively integrate diverse types of knowledge. To ensure the reliability and accuracy of PLMs' output, particularly for questions with temporal and numerical dimensions such as urban time series change and population change, a more rigorous validation approach is necessary.

Table 10 ChatGPT's feedback on data retrieval (generated on April 23, 2023)
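
A very simple first step towards such validation is to repeat the same query several times and measure how often the answers agree. The sketch below is one possible check under stated assumptions, not a method proposed in this paper: the answers are placeholder strings standing in for repeated model replies, and the 0.8 agreement threshold is arbitrary.

```python
from collections import Counter


def agreement_report(answers, min_agreement=0.8):
    """Summarize how consistent repeated answers to one query are.

    answers: list of answer strings returned for the same prompt.
    Returns the majority answer, its share of all answers, and a flag
    telling the researcher to verify the value by hand.
    """
    normalised = [a.strip().lower() for a in answers]
    value, count = Counter(normalised).most_common(1)[0]
    share = count / len(normalised)
    return {
        "majority_answer": value,
        "agreement": share,
        "needs_manual_check": share < min_agreement,
    }


# Placeholder replies to one repeated query (not real census figures).
replies = ["value A", "Value A ", "value B"]
print(agreement_report(replies))
```

Agreement across runs does not establish correctness, since a model can repeat the same error consistently, so figures such as census counts still need to be confirmed against an authoritative source.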

3.1.3 Comprehension skills

PLMs, being a form of artificial intelligence, are essentially based on inferences about statistical relationships and currently lack the higher-order thinking skills to understand context and nuance (Liu et al., 2023). In the context of complex urban research, this shortcoming can lead to the production of inaccurate data and misinterpretations (Kooli, 2023), resulting in responses that lack depth and insight or even deviate from the intended topic (Farrokhnia et al., 2023).

Furthermore, most PLMs are trained using general-purpose data sources, such as Wikipedia, which can limit their effectiveness in specific domains (Qiu et al., 2020). For instance, when prompted to provide information on the "evolutionary patterns of Chinese landscape", ChatGPT could only offer superficial observations and struggled to grasp the underlying evolutionary patterns (see Table 11). Consequently, generic PLMs continue to face limitations in comprehending intricate urban theories or patterns. While there exist PLMs that specialize in geography, such as ERNIE-GeoL, GeoBERT, and SpaBERT, their current use in urban research is restricted by permissions and by their limited functionality, which covers tasks such as the classification and matching of POI, address segmentation, and geographic entity coding.

Table 11 ChatGPT's feedback for summarizing patterns (generated on April 23, 2023)

3.2 Sociological perspective

3.2.1 Lack of trust

The technical black box is an important feature of AI development (Yigitcanlar & Cugurullo, 2020). PLMs such as ChatGPT are capable of providing feedback to users, yet they are incapable of elucidating the computational process that underlies their decision-making and predictive capabilities (Sanderson, 2023). This limitation leads to a dilemma in the application of PLMs in urban research. On the one hand, PLMs cannot guarantee the source and reference of generated information. The opacity of PLMs could potentially result in significant consequences when dealing with certain NLP tasks that demand high precision in the context of urban research. On the other hand, the public, which is one of the focal groups of urban research, may not be confident that their private information is not being used for data retrieval and processing, thereby undermining public trust in PLMs. Consequently, PLMs need to improve their transparency and traceability, through algorithmic optimization or legal regulations, to meet the expectations of both researchers and the public.

3.2.2 Social bias and discrimination

The training data for PLMs is typically obtained from publicly available web resources. However, there exists a significant amount of biased data on the internet, including information related to race, religion, and gender, among others (Buolamwini & Gebru, 2018). This bias can persist and be reflected in PLMs after training (Farrokhnia et al., 2023; Jungwirth & Haluza, 2023). Such biases in the model can have a harmful impact on the relevant groups of the public, perpetuating stereotypes and derogatory images (Brown et al., 2020). Furthermore, the population that uses internet resources has particular group characteristics, resulting in training samples that are biased and fail to accurately reflect the needs of marginalized groups (J. Yang et al., 2023). Geographic bias is one of the larger barriers to the application of PLMs to urban research. Effective urban research data tends to be concentrated in more resource-rich areas, which leads to significant geographic differences and inequities in model training. Yet one of the purposes of urban research is to promote the equal and sustainable development of urban citizens (Meerow et al., 2019). Even minor biases can cause significant social harm (Liang et al., 2022), resulting in unreasonable allocation of urban space, unjust public decision-making, and a widening urban–rural divide.

3.2.3 Threat to information safety

Security and privacy are key issues to consider in the applications of PLMs to urban research. Because PLMs may use researchers' queries and input data as training material (Clarke, 2023), they may give rise to data leakage and data theft. Such circumstances can expose the data of individuals and cities, thereby threatening personal privacy and city security. PLMs such as ChatGPT could also be misused to steal personal information from cities or the public through phishing emails and malware (Wu et al., 2023). Furthermore, the training data of PLMs may be biased or erroneous, potentially yielding harmful output. PLMs are highly communicative and interactive. If harmful content is disseminated in large quantities, it can trigger a serious "infodemic" (De Angelis et al., 2023; Zarocostas, 2020), generating mass anxiety, hate speech, and even urban riots, thereby jeopardizing urban public safety. Consequently, researchers should be circumspect about the sensitive information they provide to PLMs, while also considering the security of PLMs' answers and strengthening the protection of urban and personal information.

4 Future directions

Based on the aforementioned exploration of the opportunities and challenges surrounding PLMs applied to urban research, we put forward several potential avenues that can enhance the role of PLMs in urban science research:

First of all, fundamental models based on urban research areas can be developed. Given the need for extensive multimodal model applications in urban research, coupled with the restrictions on using current models, the development of fundamental models within the realm of urban research could emerge as a novel avenue (F. Wang et al., 2023). This approach would incorporate multiple modalities, such as text, data, image, audio, and video, to extend the use of multi-source big data in urban research. In particular, the input and output of images will further facilitate the study of urban 2D and 3D spaces. Fundamental models customized for urban research could also enhance the accuracy and precision of results, facilitating more intricate urban research tasks, such as the exploration of complex urban theories and laws.

Secondly, human-AI collaboration can be applied to facilitate urban research. The text analysis, abstract summarization, and assisted programming capabilities of PLMs have the potential to significantly enhance the research efficiency of urban researchers. PLMs can help strengthen the cross-fertilization of urban research with other disciplines to enhance the diversity and innovation of urban research (van Dis et al., 2023). Furthermore, the integration of PLMs with emerging techniques such as deep learning can aid researchers in overcoming technical limitations and adapting to new urban research methods in the context of big data. This, in turn, would allow researchers to focus more on urban theoretical research and paradigm innovation. Finally, PLMs are expected to provide technical support for new directions in urban research, such as digital twin cities.

Thirdly, PLMs can be used to improve public participation and urban decision-making. On one hand, PLMs, such as ChatGPT, possess natural language interaction capabilities, which can be utilized to disseminate urban information to the public, thus advancing their comprehension and participation in urban research (Casares, 2018). PLMs are also expected to promote urban research by understanding cities from a more human perspective through deep learning of public opinions. On the other hand, PLMs are poised to provide crucial assistance for urban decision-making, mitigating the undue impact of subjective factors on urban decision-making, and proposing ideas for the optimization of urban decision-making.

Finally, there is a need to be wary of falsehood, privacy, and liability issues. As previously mentioned, limited data, falsity, and social bias are major concerns that need to be addressed. However, there is no clear consensus on how issues of accuracy, privacy, and liability in ChatGPT can be regulated. It is therefore important to exercise caution and skepticism when using PLMs, to improve our judgment of PLMs' answers, and to view them as tools rather than relying on them completely (Krügel et al., 2023).

5 Conclusion

In this paper, we discuss the opportunities and challenges of PLMs in urban science research, using ChatGPT as an example. PLMs play a crucial role in the study of urban institution, urban space, urban information, and citizen behaviors. Firstly, PLMs expand the breadth of urban research by covering a wide range of urban research content on a basic platform, such as urban institutional document analysis, urban spatial pattern identification, disaster simulation and prediction, and human emotion analysis. Secondly, PLMs improve the depth of multi-source big data applied to urban research through assisted programming and other means, help researchers weaken subjective bias in qualitative analysis, and improve the efficiency of urban researchers. Finally, PLMs promote the participation of multiple subjects in urban research through human–computer interaction, and also provide new ideas for urban research.

Nevertheless, PLMs still confront numerous challenges in urban research. The issues of temporal limitation, authoritative limitation, modality limitation, credibility, and weak comprehension have been exposed in studies and still pose multiple challenges. Public trust, social biases, and public safety represent significant limitations to the practical applications of PLMs in urban research. Researchers must be aware of and wary of these issues when using ChatGPT for urban research, and the issues require further discussion and consideration.

PLMs will become a potent instrument for urban researchers, especially to complement complex human-centered tasks. We hope to further promote the application of PLMs in urban research through the development of fundamental models (FMs) based on urban research domains, to enhance the innovation of new urban research paradigms in the context of big data, and to promote the dissemination and application of urban research results.

Availability of data and materials

The authors confirm that all data generated or analysed during this study are included in this published article.

References

Abdul-Rahman, M., Chan, E. H. W., Wong, M. S., Irekponor, V. E., & Abdul-Rahman, M. O. (2021). A framework to simplify pre-processing location-based social media big data for sustainable urban planning and management. Cities, 109, 102986. https://doi.org/10.1016/j.cities.2020.102986
An, J., Ding, W., & Lin, C. (2023). ChatGPT: Tackle the growing carbon footprint of generative AI. Nature, 615(7953), 586–586. https://doi.org/10.1038/d41586-023-00843-2
Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–5198. https://doi.org/10.18653/v1/2020.acl-main.463
Biswas, S. S. (2023a). Potential Use of Chat GPT in Global Warming. Annals of Biomedical Engineering. https://doi.org/10.1007/s10439-023-03171-8
Biswas, S. S. (2023b). Role of Chat GPT in Public Health. Annals of Biomedical Engineering. https://doi.org/10.1007/s10439-023-03172-7
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems, 33, 1877–1901. https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html
Buolamwini, J., & Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. Proceedings of the First Conference on Fairness, Accountability and Transparency, 77–91. https://proceedings.mlr.press/v81/buolamwini18a.html
Cai, M. (2021). Natural language processing for urban research: A systematic review. Heliyon, 7(3), e06322. https://doi.org/10.1016/j.heliyon.2021.e06322
Casali, Y., Aydin, N. Y., & Comes, T. (2022). Machine learning for spatial analyses in urban areas: A scoping review. Sustainable Cities and Society, 85, 104050. https://doi.org/10.1016/j.scs.2022.104050
Casares, A. P. (2018). The brain of the future and the viability of democratic governance: The role of artificial intelligence, cognitive machines, and viable systems. Futures, 103, 5–16. https://doi.org/10.1016/j.futures.2018.05.002
Clarke, L. (2023). Call for AI pause highlights potential dangers. Science (New York, NY), 380(6641), 120–121. https://doi.org/10.1126/science.adi2240
Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., Grave, E., Ott, M., Zettlemoyer, L., & Stoyanov, V. (2020, April 7). Unsupervised Cross-lingual Representation Learning at Scale. http://arxiv.org/abs/1911.02116
De Angelis, L., Baglivo, F., Arzilli, G., Privitera, G. P., Ferragina, P., Tozzi, A. E., & Rizzo, C. (2023). ChatGPT and the rise of large language models: The new AI-driven infodemic threat in public health. Frontiers in Public Health, 11. https://doi.org/10.3389/fpubh.2023.1166120
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019, May 24). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. http://arxiv.org/abs/1810.04805
Dowling, M., & Lucey, B. (2023). ChatGPT for (Finance) research: The Bananarama Conjecture. Finance Research Letters, 53, 103662. https://doi.org/10.1016/j.frl.2023.103662
Else, H. (2023). Abstracts written by ChatGPT fool scientists. Nature, 613(7944), 423–423. https://doi.org/10.1038/d41586-023-00056-7
Emmi, P. C. (2008). Urban Complexity and Spatial Strategies: Towards a Relational Planning for Our Times: Patsy Healey. Routledge, London, 2006. 352 pages. $51.95. Journal of the American Planning Association, 74(1), 137–137. https://doi.org/10.1080/01944360701755584
Farazmand, A. (2023). Global encyclopedia of public administration, public policy, and governance. Springer Nature.
Farrokhnia, M., Banihashem, S. K., Noroozi, O., & Wals, A. (2023). A SWOT analysis of ChatGPT: Implications for educational practice and research. Innovations in Education and Teaching International, 0(0), 1–15. https://doi.org/10.1080/14703297.2023.2195846
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014, December 20). Explaining and Harnessing Adversarial Examples. arXiv.Org. https://arxiv.org/abs/1412.6572v3
Ismagilova, E., Hughes, L., Dwivedi, Y. K., & Raman, K. R. (2019). Smart cities: Advances in research—An information systems perspective. International Journal of Information Management, 47, 88–100. https://doi.org/10.1016/j.ijinfomgt.2019.01.004
Jungwirth, D., & Haluza, D. (2023). Artificial Intelligence and Public Health: An Exploratory Study. International Journal of Environmental Research and Public Health, 20(5), Article 5. https://doi.org/10.3390/ijerph20054541
Karduni, A., & Sauda, E. (2020). Anatomy of a Protest: Spatial Information, Social Media, and Urban Space. Social Media + Society, 6(1), 205630511989732. https://doi.org/10.1177/2056305119897320
Kooli, C. (2023). Chatbots in Education and Research: A Critical Examination of Ethical Implications and Solutions. Sustainability, 15(7), Article 7. https://doi.org/10.3390/su15075614
Koumetio Tekouabou, S. C., Diop, E. B., Azmi, R., & Chenal, J. (2023). Artificial Intelligence Based Methods for Smart and Sustainable Urban Planning: A Systematic Survey. Archives of Computational Methods in Engineering, 30(2), 1421–1438. https://doi.org/10.1007/s11831-022-09844-2
Krügel, S., Ostermaier, A., & Uhl, M. (2023). ChatGPT’s inconsistent moral advice influences users’ judgment. Scientific Reports, 13(1), Article 1. https://doi.org/10.1038/s41598-023-31341-0
Li, X. (2021). Examining the spatial distribution and temporal change of the green view index in New York City using Google Street View images and deep learning. Environment and Planning B: Urban Analytics and City Science, 48(7), Article 7. https://doi.org/10.1177/2399808320962511
Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., Zhang, Y., Narayanan, D., Wu, Y., Kumar, A., Newman, B., Yuan, B., Yan, B., Zhang, C., Cosgrove, C., Manning, C. D., Ré, C., Acosta-Navas, D., Hudson, D. A., … Koreeda, Y. (2022, November 16). Holistic Evaluation of Language Models. http://arxiv.org/abs/2211.09110
Liu, Y., Han, T., Ma, S., Zhang, J., Yang, Y., Tian, J., He, H., Li, A., He, M., Liu, Z., Wu, Z., Zhu, D., Li, X., Qiang, N., Shen, D., Liu, T., & Ge, B. (2023). Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models. https://doi.org/10.48550/ARXIV.2304.01852
Luo, X., Tong, S., Fang, Z., & Qu, Z. (2019). Frontiers: Machines vs. Humans: The Impact of Artificial Intelligence Chatbot Disclosure on Customer Purchases. Marketing Science, mksc.2019.1192. https://doi.org/10.1287/mksc.2019.1192
Marshall, S. (2012). Planning, Design and the Complexity of Cities. In J. Portugali, H. Meyer, E. Stolk, & E. Tan (Eds.), Complexity Theories of Cities Have Come of Age: An Overview with Implications to Urban Planning and Design (pp. 191–205). Springer. https://doi.org/10.1007/978-3-642-24544-2_11
McCormick, K., Anderberg, S., Coenen, L., & Neij, L. (2013). Advancing sustainable urban transformation. Journal of Cleaner Production, 50, 1–11. https://doi.org/10.1016/j.jclepro.2013.01.003
Meerow, S., & Newell, J. P. (2019). Urban resilience for whom, what, when, where, and why? Urban Geography, 40(3), 309–329. https://doi.org/10.1080/02723638.2016.1206395
Meerow, S., Pajouhesh, P., & Miller, T. R. (2019). Social equity in urban resilience planning. Local Environment, 24(9), 793–808. https://doi.org/10.1080/13549839.2019.1645103
Merow, C., Serra-Diaz, J. M., Enquist, B. J., & Wilson, A. M. (2023). AI chatbots can boost scientific coding. Nature Ecology & Evolution, 1–3. https://doi.org/10.1038/s41559-023-02063-3
Min, B., Ross, H., Sulem, E., Veyseh, A. P. B., Nguyen, T. H., Sainz, O., Agirre, E., Heinz, I., & Roth, D. (2021). Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey. https://doi.org/10.48550/ARXIV.2111.01243
OpenAI. (2022). Introducing ChatGPT. https://openai.com/blog/chatgpt
OpenAI. (2023). GPT-4 Technical Report. http://arxiv.org/abs/2303.08774
Owens, B. (2023). How Nature readers are using ChatGPT. Nature, 615(7950), 20–20. https://doi.org/10.1038/d41586-023-00500-8
Qiu, X., Sun, T., Xu, Y., Shao, Y., Dai, N., & Huang, X. (2020). Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 63(10), 1872–1897. https://doi.org/10.1007/s11431-020-1647-3
Salganik, M. J. (2018). Bit By Bit: Social Research in the Digital Age. Princeton University Press. https://www.bitbybitbook.com/en/preface/
Sanderson, K. (2023). GPT-4 is here: What scientists think. Nature, 615(7954), 773–773. https://doi.org/10.1038/d41586-023-00816-5
Sharifi, A., Khavarian-Garmsir, A. R., Allam, Z., & Asadzadeh, A. (2023). Progress and prospects in planning: A bibliometric review of literature in Urban Studies and Regional and Urban Planning, 1956–2022. Progress in Planning, 100740. https://doi.org/10.1016/j.progress.2023.100740
Sobania, D., Briesch, M., Hanna, C., & Petke, J. (2023, January 20). An Analysis of the Automatic Bug Fixing Performance of ChatGPT. http://arxiv.org/abs/2301.08653
Teubner, T., Flath, C. M., Weinhardt, C., van der Aalst, W., & Hinz, O. (2023). Welcome to the Era of ChatGPT et al. Business & Information Systems Engineering, 65(2), 95–101. https://doi.org/10.1007/s12599-023-00795-x
van Dis, E. A. M., Bollen, J., Zuidema, W., van Rooij, R., & Bockting, C. L. (2023). ChatGPT: Five priorities for research. Nature, 614(7947), 224–226. https://doi.org/10.1038/d41586-023-00288-7
Verhoeven, F., Wendling, D., & Prati, C. (2023). ChatGPT: When artificial intelligence replaces the rheumatologist in medical writing. Annals of the Rheumatic Diseases, ard-2023-223936. https://doi.org/10.1136/ard-2023-223936
Wang, F., Yang, J., Wang, X., Li, J., & Han, Q.-L. (2023). Chat with ChatGPT on Industry 5.0: Learning and Decision-Making for Intelligent Industries. IEEE/CAA Journal of Automatica Sinica, 10(4), 831–834. https://kns.cnki.net/kcms2/article/abstract?v=3uoqIhG8C44YLTlOAiTRKu87-SJxoEJu6LL9TJzd50nxqpCV5Tz7Jf3mUPQQ3zwo4lgGbeTaJLLzAwF_KiWkHB-qzY7Z0OIS&uniplatform=NZKPT
Wang, C., & Yin, L. (2023). Defining Urban Big Data in Urban Planning: Literature Review. Journal of Urban Planning and Development, 149(1), 04022044. https://doi.org/10.1061/(ASCE)UP.1943-5444.0000896
Wang, F., Li, J., Qin, R., Zhu, J., Mo, H., & Hu, B. (2023). ChatGPT for Computational Social Systems: From Conversational Applications to Human-Oriented Operating Systems. IEEE Transactions on Computational Social Systems, 10(2), 414–425. https://doi.org/10.1109/TCSS.2023.3252679
Wang, J., & Biljecki, F. (2022). Unsupervised machine learning in urban studies: A systematic review of applications. Cities, 129, 103925. https://doi.org/10.1016/j.cities.2022.103925
Webb, R., Bai, X., Smith, M. S., Costanza, R., Griggs, D., Moglia, M., Neuman, M., Newman, P., Newton, P., Norman, B., Ryan, C., Schandl, H., Steffen, W., Tapper, N., & Thomson, G. (2018). Sustainable urban systems: Co-design and framing for transformation. Ambio, 47(1), 57–77. https://doi.org/10.1007/s13280-017-0934-6
Wu, T., He, S., Liu, J., Sun, S., Liu, K., Han, Q.-L., & Tang, Y. (2023). A Brief Overview of ChatGPT: The History, Status Quo and Potential Future Development. IEEE/CAA Journal of Automatica Sinica, 10(5), 1122–1136. https://doi.org/10.1109/JAS.2023.123618
Yang, J., Jin, H., Tang, R., Han, X., Feng, Q., Jiang, H., Yin, B., & Hu, X. (2023). Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. http://arxiv.org/abs/2304.13712
Yang, H. (2023). How I use ChatGPT responsibly in my teaching. Nature. https://doi.org/10.1038/d41586-023-01026-9
Yigitcanlar, T., & Cugurullo, F. (2020). The Sustainability of Artificial Intelligence: An Urbanistic Viewpoint from the Lens of Smart and Sustainable Cities. Sustainability, 12(20), Article 20. https://doi.org/10.3390/su12208548
Zarocostas, J. (2020). How to fight an infodemic. The Lancet, 395(10225), 676. https://doi.org/10.1016/S0140-6736(20)30461-X
Zhang, B., Ding, D., & Jing, L. (2022, December 30). How would Stance Detection Techniques Evolve after the Launch of ChatGPT? arXiv.org. https://arxiv.org/abs/2212.14548v3
Zheng, O., Abdel-Aty, M., Wang, D., Wang, Z., & Ding, S. (2023, March 21). ChatGPT Is on the Horizon: Could a Large Language Model Be All We Need for Intelligent Transportation? http://arxiv.org/abs/2303.05382
Zhu, J.-J., Jiang, J., Yang, M., & Ren, Z. J. (2023). ChatGPT and Environmental Research. Environmental Science & Technology. https://doi.org/10.1021/acs.est.3c01818
Acknowledgements

This research is supported by the Center for Balance Architecture of Zhejiang University (Project No: K Heng 20203512-02B, Index and planning methods of resilient cities).

Funding

The Center for Balance Architecture of Zhejiang University, K Heng 20203512-02B, Haoying Han

Author information

Authors and Affiliations

College of Civil Engineering and Architecture, Zhejiang University, Hangzhou, 310058, Zhejiang, China
Jiayi Fu, Haoying Han & Xing Su
Faculty of Innovation and Design, City University of Macau, Macau, 999078, China
Haoying Han
School of Civil and Environmental Engineering and Earth Sciences, Clemson University, Clemson, SC, 29634, USA
Chao Fan

Authors

Jiayi Fu
Haoying Han
Xing Su
Chao Fan

Contributions

Jiayi Fu: conceptualization of this study, methodology, data curation, writing—original draft preparation, software. Haoying Han: data curation, revising the draft. Xing Su: data curation, revising the draft. Chao Fan: conceptualization of this study, methodology, writing & revising—original draft preparations.

Corresponding authors

Correspondence to Haoying Han or Chao Fan.

Ethics declarations

Ethics approval and consent to participate

Not applicable. All authors of this article declare they consent to participate.

Consent for publication

All authors of this article declare they consent to publication.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fu, J., Han, H., Su, X. et al. Towards human-AI collaborative urban science research enabled by pre-trained large language models. Urban Info 3, 8 (2024). https://doi.org/10.1007/s44212-024-00042-y

Received: 24 September 2023
Revised: 17 December 2023
Accepted: 08 February 2024
Published: 29 April 2024
DOI: https://doi.org/10.1007/s44212-024-00042-y
