Landscape sustainability science (LSS) focuses on linkages and interactions among landscape pattern, biodiversity, ecosystem services, and human well-being (Wu 2013). As an interdisciplinary and transdisciplinary research field, LSS increasingly calls for cross-disciplinary approaches, especially those integrating natural and social sciences (Wu 2021, 2022). Urban landscapes are social-ecological systems and provide ideal places to develop such approaches (Pickett et al. 2016; Zhou et al. 2021). Urban ecology (UE) studies have grown rapidly over the past 25 years, seeking to understand the impacts of urbanization and to promote residents’ well-being. A recent review shows that the number of peer-reviewed publications on urban ecology has grown by 15% annually over the past two decades (Chen and Huang 2020). Urban ecology pays attention to people in urban change, along with the expansion of urbanized land cover and changes in its spatial form (Zhou et al. 2022). Indeed, the two major characteristics of urban ecology, spatial heterogeneity and the coupled social-ecological nature of the city, often appear side by side in UE studies (Pickett et al. 2001). These studies view cities as coupled social-ecological systems and therefore require social data for analysis. Social data are critical for answering questions such as: How are people impacted by urbanization processes? Who receives the benefits and who bears the costs during urbanization? How do urban dwellers’ responses, decisions, or behaviors feed back to the system? Such questions are highly relevant to the core research agenda of LSS.

Answering these questions requires incorporating a social dimension into UE studies and viewing cities as coupled social-ecological systems, with interactions between people and nature at multiple scales (Pickett et al. 2011). Our search for peer-reviewed publications with the keywords “city or urban” and “ecology” indicated that 39% of studies considered social factors (Fig. 1; also see Supplementary Material—Appendix 1 and Table S1 in Supplementary Material—Appendix 2). However, this proportion varies considerably across countries and regions. For example, 46% of the UE studies conducted in the US included a social aspect. In contrast, less than 30% of China’s UE studies considered social factors (Fig. 1). We argue that the (un)availability of social data, which are indispensable for integrative UE research, is a major reason for this imbalance.

Fig. 1

Publications of urban ecological studies with and without a social dimension. Globally, 39% of published UE studies considered social factors. This proportion is higher in the US (46%) and lower in China (29%). UE studies in the US and China account for 14% and 15% of the global total, respectively

Importance of social data to urban ecology

We define social data as information about people. Social data describe people from a variety of aspects, such as their demographic characteristics, their preferences or perceptions about certain issues (e.g., trees in parks or global warming), their willingness to take action (e.g., exercise or take public transit), and their behavior (Chen and Huang 2020). Social data enable researchers to capture social differentiation among people when studying coupled urban social-ecological systems (Pickett et al. 1997; Grimm et al. 2000; Redman et al. 2004; Singh et al. 2010; Collins et al. 2011; Miller et al. 2014).

Understanding social differentiation among urban residents is crucial in UE studies for at least two reasons. First, social differentiation provides clues to differences in how people make decisions that affect the environment. Second, social differentiation indicates how vulnerable people are in the face of environmental challenges such as heatwaves, water scarcity, or hurricanes. Therefore, it is essential for urban ecologists to be able to describe social differentiation among residents and to associate the resulting groups with a variety of research topics, such as the provision of urban greenspace (Wolch et al. 2014), heat vulnerability (Huang et al. 2011), and the health consequences of air pollution (Huang et al. 2019).

Urban greenspace (UGS) refers to vegetation and open space in cities, which provide a variety of ecosystem services to residents, including places for recreation, exercise, and restoration, as well as heat mitigation. We use UGS as an example to show how incorporating social data can expand the range of research topics, and provide some sample research questions that need social data to address (Fig. 2). Notice that only the first question, which focuses on the features of UGS and its impact on biodiversity, can be answered without social data. In addition to its spatial patterns and biodiversity, we would also want to know the accessibility of UGS to different social groups (Rigolon 2016), the factors that affect people’s perceptions of a greenspace (Pietrzyk-Kaszynska et al. 2017), their willingness to visit (Tu et al. 2020), the impacts on biodiversity from both visitors and UGS management (Chang and Lee 2016), and finally the health benefits of using greenspace (Liu et al. 2017). Answering these questions requires social data. With such information, researchers can go beyond the biophysical features or patterns of urban greenspace to a more complete picture of how it functions and interacts within the coupled social-ecological system.

Fig. 2

Examples of research questions addressed by different types of data

Different types of social data in urban ecological studies

Social data used in UE studies can vary substantially from case to case. We summarize commonly used social data into three categories. The first type describes people’s demographic and socioeconomic characteristics. This type of data includes the number of people and their age, gender, educational attainment, income, marital status, employment, race, etc. The second type focuses on how people think, often described as their perceptions about certain issues or their willingness to take certain actions. Examples include visitors’ perceptions of biodiversity in parks (Muratet et al. 2015) and people’s willingness to pay for new parks (Andrews et al. 2017). The third type is related to behaviors. The behaviors usually measured are those with direct environmental consequences, such as how often residents fertilize their lawns and how much fertilizer they use (Zhou et al. 2008).

These social data mostly come from three sources: statistics compiled and published by governments, commercial datasets, and investigations or surveys conducted by researchers (Chen and Huang 2020). Availability varies among sources as well as among places. Next, we explain each source in more detail, using examples from the US and China for comparison.

Government statistical data

Government statistical data consist mostly of demographic and socioeconomic information, with some information on people’s behaviors. For example, the US Census asks how people commute to work. The decennial census conducted by the US federal government is collected at relatively fine scales, organized in block groups of roughly 400 households on average. The census enumerates the number of residents, their gender, age, race and ethnicity, ownership versus rental of the dwelling, and relationships among the members of the household. More aggregated, community-based statistics (for example, on income, poverty, employment, marital or partnership status, housing value, national origin, and businesses) and other detailed information are collected every three to five years through the American Community Survey. These data are readily available without charge from the Census Bureau website (http://data.census.gov).

The situation for government statistical data in China is quite different. Coarse scale, limited information, and inconsistent availability all pose obstacles to researchers using government statistical data in China. First, census data are usually collected by the lowest administrative unit, which can encompass an entire town or be the equivalent of a district in a large city (Zhang et al. 2020; National Bureau of Statistics 2021). Second, census data include such things as age, gender, educational attainment, residential registry situation (known as hukou in Chinese), and types of housing. Unfortunately, household income, which is commonly used to describe people’s economic situation, is not collected at the finest level of the census. This is likely because the census was designed in the 1950s, when income was not an important indicator of people’s social status in China (Xie and Zhou 2014). Instead, indicators such as hukou registry and number of bedrooms captured more of the differences among people than income did at that time. Currently, however, these indicators often fail to capture the relevant differences among people as urban and consumption-based lifestyles become more prevalent (Wu 2019).

Finally, data availability varies across cities and regions. While some cities (such as Beijing) may have census data readily downloadable for free, in many cases data are directly accessible only from a county or district office (Li and Liu 2016). Therefore, if researchers investigate the relationship between educational attainment and greenspace use in a city, they need to contact (or sometimes visit) the Department of Statistics in each district of that city to obtain education data. The number of districts in a city varies from several to over a dozen.

Commercial data

Private sources of data relevant for UE studies go beyond the basics of demographics, education, and income to include information on consumption and lifestyle (Grove et al. 2014). In the US, information on consumption patterns by census block or postal zone is collected by retailers and aggregated into lifestyle clusters (e.g., Claritas© or Tapestry©). In addition, private firms organize and curate census data in ways convenient to users for a fee. In contrast, there are few commercially available data sources in China. Researchers may purchase some datasets from enterprises for analysis, such as comments or posts from social media, places of interest, and mobile app use (Xiao et al. 2019). However, these datasets are often generated as a “by-product” of the enterprises’ main business, contain limited information, and often need extra work to organize before they can be used (Zhang and Zhou 2018). A particularly serious gap in China is the absence of data that index consumption and lifestyle characteristics, given the country’s rapid economic growth and urbanization since the economic reform in 1978 (Chen et al. 2016).

Survey

Surveys usually meet the needs of a study better than government statistics and commercial datasets in terms of the selection of indicators, but they demand much more time, knowledge, labor, and financial resources. They are often the last resort for obtaining social data when relevant information is not available or accessible from government statistics or commercial datasets. When studying a large population, it can be challenging to obtain a representative sample.

Limited social data hamper integrative urban ecological research

It is important to recognize the dependence of urban ecologists on social data to advance social-ecological understanding and applications. A lack of adequate, good-quality, or accessible social data in certain countries or regions may further enlarge the existing gap in integrative urban studies and in the ability to promote human well-being and urban resilience. For example, UE studies in China account for 15% of the world's output and, measured by the number of peer-reviewed publications, are growing at nearly twice the global rate (China 27% vs. World 15%) (Supplementary Material—Appendix 3). Due to a lack of social data, however, the number of publications with a social dimension in China has been growing at a slower pace (Fig. S1 in Supplementary Material—Appendix 3). If these trends continue, we will see more UE studies from China but less emphasis on the social dimension and less ability to inform understanding and decision making.
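The compounding behind such a trend can be illustrated with a minimal sketch, which is not part of the original analysis. It assumes, purely for illustration, that the two annual growth rates quoted above (27% for China and 15% for the world, with China's publications counted within the world total) remain constant, and it starts from China's 15% share of output. The function name `projected_share` and the chosen horizons are hypothetical.

```python
def projected_share(initial_share: float, regional_growth: float,
                    global_growth: float, years: int) -> float:
    """Share of global output after `years`, under constant annual growth rates.

    Assumes the regional count is part of the global count, so the share is
    multiplied by (1 + regional_growth) / (1 + global_growth) each year.
    """
    ratio = (1 + regional_growth) / (1 + global_growth)
    return initial_share * ratio ** years


# Figures quoted in the text: 15% share, 27% vs. 15% annual growth.
# The horizons (0, 5, 10 years) are illustrative assumptions only.
for year in (0, 5, 10):
    share = projected_share(0.15, 0.27, 0.15, year)
    print(f"year {year:2d}: {share:.1%}")
```

Constant growth rates are a strong simplification, so the output should not be read as a forecast. The sketch only shows how quickly a difference in growth rates compounds, which is why a slower-growing social dimension within a fast-growing body of work matters.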

We compared UE studies of Baltimore, US and Beijing, China for inclusion of social data (Table S2 in Supplementary Material—Appendix 2). Both cities have well-established, long-term ecological study sites. We examined peer-reviewed publications from the year each research site was established (1997 for Baltimore and 2008 for Beijing) to 2019, and our search identified 18 and 23 publications on studies conducted in Baltimore and Beijing, respectively. We found that studies in Beijing relied heavily on individual surveys (e.g., direct observation, questionnaire survey, and interview) to obtain social data. Only one study in Beijing used an education indicator from a government statistical database. In contrast, government statistical data were much more commonly used in Baltimore, where seven studies used indicators from seven different government statistical databases. In addition, one study in Beijing employed “big data” from social media in an attempt to acquire wide-ranging data at relatively low cost. Disadvantages in the accessibility and quality of social data in other places (often developing countries) may also discourage researchers from conducting integrative urban studies, which are badly needed to understand the unique urbanization trajectories of those places (McHale et al. 2015).

Call for actions

While data availability poses an obvious obstacle to integrating social data into UE studies, we urge researchers to incorporate social data in UE studies, especially in places that have been less studied. There are always opportunities to overcome the present challenges. First, we recommend that governments make existing statistical data more accessible to academic communities. If resources allow, we encourage governments to review and update census data to capture social differentiation. For example, considering the rapid economic growth in China, household income is an important indicator that should be included in census information. Second, we suggest that researchers share the data from their own work, regardless of its source: individual surveys or a collection of existing government datasets. Without data documentation and data sharing, precious resources (i.e., time and labor) are wasted in repetitive work of collecting, obtaining, and cleaning data. Creative ways are needed to give appropriate credit and so encourage researchers to share datasets. Last, big data from social media and other platforms are becoming a new and promising source of social data. Big data may reveal information that has seldom been captured before, cover much larger scales, incur much lower costs, and, especially, may address places where traditional datasets from government and commercial sources are limited. The scale, coverage, access, content, and cost of big data vary substantially among countries and regions, and a comparison is beyond the scope of this paper. Big data also have some recognized limitations (Ilieva and McPhearson 2018). Nevertheless, they provide an important source of information about people. In addition, the rise of technology and social media also makes some traditional ways of generating social data easier. For example, smartphone apps make it easier to automatically link citizen science observations to locations, in contrast to uploading data to a designated webpage.

In conclusion, it is important to recognize the dependence of urban ecologists on social data to advance social-ecological understanding and applications. Future research could comparatively evaluate the social-ecological data informatics systems needed to advance actionable science, knowledge, and applications for this urban 21st century. We urge practitioners and researchers to work together to improve data integration for UE research through interdisciplinary collaborations, data sharing, and the utilization of big data, especially in places that have been less studied. Such actions advance the social-ecological understanding of urban landscapes and facilitate the development of cross-disciplinary approaches for integrated research across natural and social sciences to advance LSS.