1 Introduction

The world's population has grown significantly in recent decades and is projected to continue increasing, with a substantial portion residing in urban areas (UNDESA Population Division, 2019). This trend is particularly evident in densely populated regions where the demand for housing and land is rising (Duin et al., 2018; Groenemeijer et al., 2020). The current intensive land use already puts pressure on green and water infrastructure and on biodiversity, so dealing with these threats requires safeguarding natural and rural areas. As a result, the scarcity of available land for city growth is prompting a shift towards densifying existing cities.

Densification, however, makes it challenging to maintain and improve the livability of urban areas, which encompasses aspects such as human well-being (Nabielek et al., 2012). The relationship between urban design practices and the well-being of inhabitants has been firmly established in the existing literature (Martin and March, 1972; Lynch, 1972; Fisher-Gewirtzman, 2017; Weijs-Perrée et al., 2019). In recent years, street view imagery has been utilized as an efficient source of data to understand the impact of various built environment characteristics on human perception (e.g., Biljecki and Ito, 2021; Liang et al., 2023). Compared to text-based surveys, surveys that utilize street view imagery datasets provide (i) accurate visual data that avoid biases arising from the respondents’ imagination and interpretation of the text, (ii) fine spatial coverage, and (iii) an understanding of people’s perception based on large datasets (Dubey et al., 2016).

Built environment characteristics such as the amount of green and water elements (Van Vliet et al., 2021), the height of buildings, and the size of open spaces (Karimimoshaver and Winkemann, 2018; Joglekar et al., 2020) play a critical role, particularly in densely populated areas, in increasing livability and supporting the well-being of inhabitants (Bardhan et al., 2015), as such attributes influence how people perceive their environment in terms of safety, beauty, liveliness, etc. Poorly designed dense areas can lead to negative perceptions such as noise, dirt, and lack of safety, and can thereby reduce the subjective well-being of inhabitants. Merely densifying by constructing new neighborhoods is therefore insufficient. Urban design practices that explicitly acknowledge and enhance inhabitants’ well-being should therefore be emphasized (Haifler and Fisher-Gewirtzman, 2022).

The design and development of urban areas typically involve a range of stakeholders, such as planners, designers, developers, communities, and public authorities. In recent years, responsibilities have shifted from public authorities to market parties such as real estate and urban developers, so that private parties take a leading role in the urban area development process while ensuring that public interests are upheld (Heurkens, 2012). However, public interests such as inhabitants’ well-being might not be a core interest for private parties. Therefore, at the initial design phase of urban areas, the various stakeholders should work together to set a shared vision and objectives. To navigate the complexities and responsibilities associated with urban design and development while maintaining or increasing the livability of urban areas, there is a need for supportive tools that provide insights into the effects of urban design decisions on overall human perception and, in turn, on well-being.

A branch of such supportive digital tools is generally classified as Computational Urban Design (CUD) tools. They allow for a shift of the traditional design process from designing geometries to designing based on shared-decision variables and desired outcomes. Instead of manually drawing structures, a CUD tool generates or adjusts geometries based on an algorithm where dimensions, volumes, design-decision variables and outcomes are set either as design constraints or as targets. The values of design variables are derived from the analysis of the existing urban system and are therefore based on evidence. The resulting design outputs provide various scenarios to understand the impact of design decisions on the overall urban system. CUD tools can rapidly generate designs based on data, providing an overview of potential design scenarios and their consequences. This makes CUD tools particularly effective in the early stages of urban area design.

Current CUD tools lack the ability to include human perception of the built environment, especially in terms of the influence of design decisions (e.g., the height of buildings or the amount of green) on people’s well-being. While the impact of built environment characteristics on human perception (e.g., safety, beauty, liveliness) is acknowledged in urban design and planning studies (Weijs-Perrée et al., 2020; Birenboim, 2018), CUD tools currently do not incorporate this aspect. In this article, we explore how computational urban design can support the early design phase of an urban area by incorporating human perception to design a liveable dense area that will eventually support people’s well-being. To do so, we first investigate the influence of volumetric built environment characteristics on the perception of safety, liveliness, and beauty as indicators of subjective well-being. This is done by analyzing a large dataset of perceptions indicated for street view images. The findings of this analysis then serve as the input for a CUD tool that can computationally generate an urban neighborhood design optimized for perceived beauty, liveliness, and safety, as an output that can support the subjective well-being of inhabitants.

The remainder of the article is structured as follows. First, in Sect. 2, we describe the previous relevant work, then in Sect. 3 we introduce the developed methodology and describe the required intermediate steps. Section 4 provides some details on the implementation and shows some results obtained with different optimization strategies. Finally, Sect. 5 contains the final remarks as well as some reflections for future improvements.

2 Related work

This section presents relevant literature on human perception (perceived beauty, liveliness, and safety) in the built environment and on the measurement of human perception with the help of street view imagery. Next, we discuss the literature at the intersection of human perception and computational urban design.

2.1 Human perception in relation to the built environment

It is widely recognized that well-being is closely linked to the built environment (Fathi et al., 2020). One crucial aspect influencing well-being is how individuals perceive their surroundings in the built environment (Mouratidis, 2021; Smith et al., 2015). Studies (e.g., Birenboim, 2018; Weijs-Perrée et al., 2020) have examined how various characteristics of the built environment affect people's perception in terms of safety, liveliness, and beauty, which in turn stimulates positive and negative emotional states such as happiness, sense of security, comfort, and annoyance (Dane et al., 2019). In these studies, emotional states are found to have an impact on both the momentary and long-term subjective well-being of individuals.

Various built environment characteristics influence the perception of beauty, liveliness, and safety. In this study, we aim to explore the incorporation of human perception into computational urban design. We therefore focus on the volumetric elements, such as the main shape of the buildings, streets, and trees, and exclude non-volumetric elements such as building function and façade objects.

Regarding perceived beauty, landmarks and tall buildings in the skyline are in general perceived as beautiful (Quercia et al., 2014a; Karimimoshaver and Winkemann, 2018). This perception changes at street (eye) level, where the presence of buildings of any type generally has a negative influence on perceived beauty (Rossetti et al., 2019). In terms of vegetation, greenery contributes positively to perceived beauty, with the amount of greenery being the most influential factor (Joglekar et al., 2020; Quercia et al., 2014a; Rossetti et al., 2019; Weber et al., 2008; Zhang et al., 2018). Gardens, trees, and grass are associated with beautiful street scenes (Joglekar et al., 2020; Zhang et al., 2018). Buildings with incorporated vegetation, such as green facades, are generally preferred over those without (White and Gatersleben, 2011). Broader streets are negatively related to the perception of beauty, while small paths are positively related (Joglekar et al., 2020). Less sky view in the street scene is associated with more beautiful scenes (Joglekar et al., 2020; Rossetti et al., 2019). A sense of order and a uniform arrangement of the urban form are considered aesthetically pleasing, with low to medium complexity being associated with beautiful scenes (Weber et al., 2008; Karimi, 2012; Joglekar et al., 2020).

With respect to perceived liveliness, greenery is found to have a negative influence on liveliness, while infrastructure and vehicles have a positive influence (Zhang et al., 2018; Verma et al., 2020). However, these conclusions are based on studies that include both urban and rural street views, with rural environments typically having more greenery (Dubey et al., 2016). To obtain a more accurate understanding of the relationship between greenery and perceived liveliness, it would be beneficial to focus solely on urban street views. For instance, trees can enhance detailing and shading in streets and can positively affect perceived liveliness (Mehta, 2007).

Although crowd density is also found as one of the significant indicators of perceived liveliness, the direct link between crowd density and volumetric built-environment characteristics such as greenness, openness, enclosure, walkability, and imageability is not straightforward (Tao et al., 2022). For instance, walkability and dense road networks are positively associated with higher concentrations of people on the street (Zhang et al., 2019). The subdivision of building masses into visually distinctive segments or parts significantly influences the pedestrians' visual engagement. The presence of more plinths, defined as distinct segments in the building mass, leads to longer visual engagement with the ground floor of the building mass (Simpson et al., 2022) and might result in livelier perceived environments.

Finally, in terms of perceived safety, the presence of buildings in the street view generally has a negative influence, while the presence of greenery, specifically trees and grass, has a positive influence (Jansson, 2019; Harvey et al., 2015; Mouratidis, 2019b; Zhang et al., 2018). Vegetation taller than 2.5 m is found to positively affect perceived safety (Li et al., 2015). Additionally, the presence of sidewalks, roads, and paths, as well as the separation of walking infrastructure from the road and the width of the sidewalk, are associated with higher perceived safety (Zhang et al., 2018; Kweon et al., 2004; Al Mushayt et al., 2021).

The urban form also plays a role in perceived safety, with open spaces, clear sightlines, and refuges being positively related to perceived safety (Jansson, 2019; Rahm et al., 2021; Loewen et al., 1993). Moreover, the subdivision of the built environment and the presence of individual buildings are positively associated with perceived safety (Harvey et al., 2015). Furthermore, the ratio of building height to street width is found to be significantly related to perceived safety, with a higher ratio indicating a greater sense of enclosure and increased perceived safety (Stamps, 2005; Alkhresheh, 2007; Harvey et al., 2015). A visible horizon, depth of the street, open sides, and visible sky have a negative influence on the feeling of enclosure and, subsequently, perceived safety (Harvey et al., 2015). In contrast, Mouratidis (2019) found that street width and building height, as well as density and street-wall continuity, do not significantly influence perceived safety.

Overall, the existing literature identifies various built environment elements that influence perceived beauty, safety, and liveliness. The volumetric built environment elements that influence human perception can be categorized as (i) building (e.g., building height, visibility), (ii) vegetation (e.g., amount of visible greenery), (iii) street (e.g., width of streets and sidewalks, visibility of roads), and (iv) urban morphology (e.g., order/uniformity, density of street networks).

To measure human behavior, experiences, and perception in the built environment through surveys, various methods can be employed, such as virtual environments (Evers et al., 2023; Birenboim et al., 2021; Echevarria Sanchez et al., 2017; Johnson et al., 2010; Lee and Kim, 2021; Leite et al., 2019; Lu et al., 2021), real images and videos (Alhasoun and Gonzalez, 2019; Chen et al., 2022; Ye et al., 2019), and the tracking of real behavior (Al Mushayt et al., 2021; Batool et al., 2021; Liu et al., 2021).

When studying perception rather than experience or behavior, images of the built environment are often a cost-effective and accurate means of measuring human perception. In perception studies, realistic images such as street view imagery are preferred to minimize the difference between actual and visualized environments. The widespread availability of street view images over the past decade has provided researchers with abundant visual data on existing environments that can be presented to respondents in a cost-effective manner. The use of street view images has yielded meaningful and interesting results (Biljecki and Ito, 2021). However, street view images alone may not be sufficient for understanding human perception, as input from individuals indicating their perceptions and preferences is necessary.

The Place Pulse 2.0 dataset has been widely used in the literature to study the relationship between the built environment and human perception (Dubey et al., 2016). This dataset consists of a large collection of street view images, which have been used to train deep learning models (Zhang et al., 2018). These street view images were presented to respondents for pairwise comparisons regarding perceptions of safety, liveliness, and beauty. However, in the study from Zhang et al. (2018), the deep learning model predicting human perception scores of new images used only image segmentation data, considering only the percentages of major built environment elements visible in the images. It did not include other characteristics found in the literature, such as absolute height and distance values. Given the size and availability of the Place Pulse 2.0 dataset and the usefulness of street view imagery for studying human perception, it is a valuable resource for the scope of this current research. Additionally, using image segmentation to extract analyzable data from street view images provides valuable insights into human perception (Zhang et al., 2018), although it may not capture all relevant built environment elements.

2.2 Human perception and computational urban design

Recent studies have highlighted the usefulness of Computational Urban Design (CUD) in supporting the development of urban areas (Çalışkan, 2017; Fusero et al., 2013; Nagy et al., 2018; Steinø et al., 2013; Y. Zhang and Liu, 2021). CUD has proven to be effective in generating conceptual designs quickly while considering the different interests of stakeholders and the complexities of existing urban environments (Steinø et al., 2013; Nagy et al., 2018; Wilson et al., 2019).

As can be seen in Table 1, recent literature on CUD that pertains to densification and optimization problems often focuses on specific design problems (e.g., mobility, economic/financial and environmental sustainability, adaptive master plan) and fails to address the full range of disciplines in urban design. While CUD is acknowledged as a valuable tool for stakeholder involvement and management, CUD tools (i) do not include human perception in design problems and (ii) rarely incorporate comprehensive sets of key performance indicators (KPIs) (Nagy et al., 2018). This lack of inclusiveness currently limits the effectiveness and scope of CUD tools.

Table 1 Main application topics found in the literature on computational urban design

To address these limitations, future research should aim to develop CUD tools that incorporate multiple KPIs (e.g., subjective well-being) and consider human perception of the built environment. By broadening the scope and comprehensiveness of these tools, their effectiveness in supporting urban design processes can be enhanced.

As CUD is a data-driven process, it is important to generalize and quantify human perception in order to incorporate it into CUD effectively. Nonetheless, relying solely on generalized subjective relations can lead to overly generic designs that may have negative consequences for human well-being (Altomonte et al., 2020). At the same time, creating generic designs and retrieving fast insights into design options is especially advantageous when a CUD tool is used during the exploration and vision-setting phase of urban design. In summary, quantifying human perception, generalizing its influence, and considering the level of detail in design phases are all crucial aspects to be addressed in the context of CUD.

3 Methods

This section describes the data and the methodology used for this study. Firstly, we explain the use of the Place Pulse 2.0 dataset and the retrieved auxiliary built environment data from other resources in order to explain the relation between the volumetric built environment and human perception, which was done by means of multinomial logit analysis. Further, it is explained how these findings were utilized as input for and integrated into a CUD tool.

3.1 The influence of the volumetric built environment characteristics on human perception: Analyzing the street view images based survey

3.1.1 Data and data processing

The aim of this step is to describe the relationship between perceived beauty, liveliness, and safety, and the built environment characteristics. Moreover, the built environment characteristics should represent the volumetric built environment elements, and the data should allow these elements to be quantified and measured. To achieve this, a big data approach has been employed, initially using street view imagery and human perception choices from the Place Pulse 2.0 dataset (Dubey et al., 2016). Place Pulse 2.0 contains over 110,000 Google Street View images along with one metadata dataset containing 1,223,649 choices made by people between two images based on their preference in relation to human perception categories. The choice data of the MIT Place Pulse 2.0 dataset was gathered in 2016 using crowdsourcing, via a website where a pairwise comparison of street photos was presented to respondents. In total, 81,630 individuals responded to the survey. For each pairwise comparison, a respondent had to choose the image perceived as more beautiful, lively, safe, etc., as can be seen in Fig. 1. The dataset does not contain any data about respondent characteristics, so the choice of a respondent cannot be related to the personal characteristics or experiences of that respondent.

Fig. 1

Example of a choice that a respondent had to make between two street view images according to the perception of safety (Dubey et al., 2016)

In our study, the images in the dataset were segmented to extract the volumetric built environment data visible in the images, while the location data of these images was used to gather additional volumetric built environment data surrounding the image locations. To calculate the shares of the volumetric built environment elements in the images, image segmentation techniques were used. More specifically, the PSPnet-50 model pre-trained on the Ade20k dataset (Zhao et al., 2017) was applied to every image in the Place Pulse 2.0 dataset. This pre-trained model was selected based on two criteria. First, the model should be able to find the relevant built environment elements in the image, namely buildings, roads, trees, and sky, as accurately as possible. The study of Zhang et al. (2021), which compares several image segmentation algorithms on street view images, shows that PSPnet recognizes the built environment elements better than the other algorithms. The second criterion was that the application of the model should be well-documented and open source. Using PyTorch and Google Colab, the pre-trained PSPnet50-Ade20k model was applied to all 110,998 images. Using the segmented image data, the share of the relevant built environment elements was calculated for every image. These shares are used as attributes in the analysis. Figure 2 shows some examples of segmented images.
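The share calculation described above can be illustrated with a minimal sketch. It assumes that the segmentation model returns a 2-D mask of integer class labels per image. The label ids in `CLASS_IDS`, the function name `element_shares`, and the toy mask are assumptions made for illustration and are not taken from the study's pipeline.

```python
from collections import Counter

# Assumed label ids; the actual indices depend on the label map of the
# segmentation model that is used.
CLASS_IDS = {"building": 1, "sky": 2, "tree": 4, "road": 6}

def element_shares(mask, class_ids=CLASS_IDS):
    """Return the fraction of pixels per named class in a 2-D label mask."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask is empty")
    return {name: counts.get(idx, 0) / total for name, idx in class_ids.items()}

# Toy 4x4 mask standing in for one segmented street view image.
toy_mask = [
    [2, 2, 2, 2],
    [1, 1, 4, 4],
    [1, 1, 4, 0],
    [6, 6, 6, 6],
]
shares = element_shares(toy_mask)
print(shares)
```

Pixels of classes that are not listed (label 0 in the toy mask) still count towards the total, so the reported shares do not necessarily sum to one.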

Fig. 2

Examples of segmented images

The Place Pulse 2.0 dataset contains images taken in both urban and non-urban areas. It also contains images that are fully focused on one building or one row of buildings and therefore do not fully represent an urban streetscape. In addition, images taken on highways can be classified as non-urban streetscapes. In the current study, such images were filtered out of the dataset based on the shares of built environment elements calculated through image segmentation. The pairwise comparisons that included such an image were consequently also excluded from our dataset.

For every image, the geographical position where it was taken is recorded in the dataset as latitude and longitude coordinates. This means that, in addition to the segmented data of the image, more elements can be gathered using auxiliary data on the built environment around the position of the image. These auxiliary datasets can be categorized as datasets containing building data (OpenStreetMap contributors, 2021) and datasets containing street network data (Boeing, 2017). However, these auxiliary datasets are not available for every location. Therefore, we also filtered out the images for which no auxiliary datasets were available.

As a result of the above-described filtering conditions (filtering images of non-urban areas and filtering images with non-available auxiliary datasets), 7,158 images from the Place Pulse 2.0 dataset were chosen, resulting in 6,522 choices that are used in this research.
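The two filtering conditions can be sketched as follows. The text does not report the share thresholds that were used, so the values in `is_urban_streetscape` are purely hypothetical, as are the function names, the record layout, and the toy data. The sketch only shows the logic: a comparison is kept only if both of its images pass the filters.

```python
def is_urban_streetscape(shares, min_building=0.05, max_building=0.8, max_road=0.6):
    """Hypothetical rule: drop facade close-ups (very high building share) and
    highway-like scenes (hardly any buildings, or a very high road share)."""
    return (min_building <= shares["building"] <= max_building
            and shares["road"] <= max_road)

def filter_comparisons(comparisons, shares_by_image, images_with_auxiliary_data):
    """Keep only comparisons in which both images pass both filters."""
    keep = {
        image_id
        for image_id, shares in shares_by_image.items()
        if is_urban_streetscape(shares) and image_id in images_with_auxiliary_data
    }
    return [c for c in comparisons if c["left"] in keep and c["right"] in keep]

# Invented toy data.
shares_by_image = {
    "a": {"building": 0.30, "road": 0.30},
    "b": {"building": 0.95, "road": 0.02},  # close-up of a single facade
    "c": {"building": 0.00, "road": 0.70},  # highway-like scene
    "d": {"building": 0.40, "road": 0.20},  # no auxiliary data available
    "e": {"building": 0.20, "road": 0.25},
}
images_with_auxiliary_data = {"a", "b", "c", "e"}
comparisons = [
    {"left": "a", "right": "e", "winner": "left"},
    {"left": "a", "right": "b", "winner": "left"},
    {"left": "d", "right": "e", "winner": "right"},
    {"left": "c", "right": "a", "winner": "right"},
]
kept = filter_comparisons(comparisons, shares_by_image, images_with_auxiliary_data)
print(kept)
```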

For every remaining image, multiple independent attributes describing volumetric built environment elements were retrieved from the datasets. In addition to the retrieved data on the shares of built environment elements in the images, data on the building height of surrounding buildings, data on the footprint and volume of surrounding buildings, and data related to the street on which the image is taken were retrieved. A complete overview and description of the retrieval process of the volumetric built environment data from auxiliary datasets can be found in Van Veghel (2022). For example, the facade length index was derived by dividing the total length of the buildings adjacent to the street segment around the image by the total length of the street segment. The footprint area index is the share of the area of all building footprints within a 100-m buffer around the image. The volume index is the share of all building volumes within a computed volume around the image, with the volume around the image being a 100-m buffer around the image multiplied by a set reference height of 40 m. Furthermore, the number of street segments represented the number of segments in the dataset (Boeing, 2017) that were available within a buffer of 50 m of the images.
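As a rough illustration of the three indices defined above, the following sketch computes them from simple numeric inputs. It assumes that the buffer is a circle around the image location, that footprint areas have already been clipped to that buffer, and that a building volume can be approximated as footprint area times height. These simplifications and all function names are assumptions; the actual retrieval process is described in Van Veghel (2022).

```python
import math

def facade_length_index(adjacent_building_lengths, street_segment_length):
    """Total length of buildings adjacent to the street segment / segment length."""
    return sum(adjacent_building_lengths) / street_segment_length

def footprint_area_index(footprint_areas, buffer_radius=100.0):
    """Share of a circular buffer (radius in m) covered by building footprints."""
    return sum(footprint_areas) / (math.pi * buffer_radius ** 2)

def volume_index(buildings, buffer_radius=100.0, reference_height=40.0):
    """Share of the reference volume (buffer area x reference height) that is built.

    buildings: iterable of (footprint_area_m2, height_m) pairs; each volume is
    approximated as area x height (an assumption of this sketch).
    """
    built_volume = sum(area * height for area, height in buildings)
    return built_volume / (math.pi * buffer_radius ** 2 * reference_height)

# Invented example values.
fli = facade_length_index([30.0, 45.0], 100.0)
fai = footprint_area_index([math.pi * 2500.0])
vi = volume_index([(math.pi * 2500.0, 20.0)])
print(fli, fai, vi)
```

In the example, footprints covering a quarter of the buffer with a uniform height of 20 m fill one eighth of the 40 m reference volume.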

The final list of volumetric built environment elements derived from the segmented images of the Place Pulse 2.0 dataset and from the auxiliary datasets can be found in Table 2.

Table 2 Volumetric built environment elements derived from street view imagery and open spatial datasets

3.1.2 Analysis results

For the data analysis, the dependent variable is the respondents’ choice between two street view images (perceived as safer/livelier/more beautiful). The independent variables in our analysis are the volumetric built environment elements derived from the segmentation of street view images and auxiliary datasets. As the dependent variable concerns the choice between two street view images, discrete choice models are appropriate for analyzing respondents’ preferences. These models are based on Random Utility theory, which assumes that individuals choose the alternative that yields the highest utility (see e.g., Hensher, Rose, and Greene, 2015). An alternative’s utility consists of a structural part and a random part. According to the multinomial logit model (MNL), the structural part is a weighted summation of a selection of the built environment elements listed in Table 2. The probability that alternative $i$ is chosen is equal to $p_i = \exp(V_i) / (\exp(V_1) + \exp(V_2))$, with $V_i = \sum_k \beta_k X_{ik}$. Here, $V_i$ represents the structural utility of alternative $i$ ($i = 1, 2$), $X_{ik}$ is the value of the $k$th built environment element of alternative $i$, and $\beta_k$ measures the impact of the $k$th built environment element. The $\beta_k$ are estimated by maximizing the likelihood of the observed choices. The multinomial logit model is the most basic discrete choice model and has been applied to quantify the relationship between volumetric built environment elements and human perception by utilizing the selected choice data from the Place Pulse 2.0 dataset and the retrieved volumetric built environment data.
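To make the model concrete, the sketch below implements the two-alternative choice probability and the log-likelihood, and estimates the coefficients with plain gradient ascent. This is only an illustration under stated assumptions: the toy choices are invented, the optimizer is a simple stand-in for whatever estimation software was used in the study, and the function names are hypothetical.

```python
import math

def utility(beta, x):
    """Structural utility V_i = sum_k beta_k * X_ik."""
    return sum(b * v for b, v in zip(beta, x))

def choice_probability(beta, x1, x2):
    """P(alternative 1 is chosen) = exp(V1) / (exp(V1) + exp(V2))."""
    d = utility(beta, x1) - utility(beta, x2)
    # Numerically stable form of 1 / (1 + exp(-d)).
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)

def log_likelihood(beta, observations):
    """observations: list of (x_chosen, x_rejected) attribute tuples."""
    return sum(math.log(choice_probability(beta, xc, xr)) for xc, xr in observations)

def estimate(observations, n_attrs, rate=1.0, steps=5000):
    """Plain gradient ascent on the mean log-likelihood (illustrative only)."""
    beta = [0.0] * n_attrs
    n = len(observations)
    for _ in range(steps):
        grad = [0.0] * n_attrs
        for xc, xr in observations:
            p = choice_probability(beta, xc, xr)
            for k in range(n_attrs):
                # d/d(beta_k) of log p = (1 - p) * (X_chosen,k - X_rejected,k)
                grad[k] += (1.0 - p) * (xc[k] - xr[k])
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

# Invented toy choices; the attributes are (tree share, sky share).
toy_choices = [
    ((0.4, 0.2), (0.1, 0.5)),
    ((0.3, 0.3), (0.2, 0.4)),
    ((0.1, 0.4), (0.3, 0.2)),
    ((0.5, 0.1), (0.2, 0.3)),
    ((0.2, 0.2), (0.4, 0.1)),
]
beta_hat = estimate(toy_choices, n_attrs=2)
ll_null = log_likelihood([0.0, 0.0], toy_choices)
ll_fit = log_likelihood(beta_hat, toy_choices)
print(beta_hat, ll_null, ll_fit)
```

With all coefficients at zero, both alternatives receive a probability of 0.5, which gives the null log-likelihood against which the fitted model can be compared.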

The dataset has been used both as a whole dataset and as split subsets, i.e. only for high-density and only for low-density areas. The Chi-square p-values for Likelihood Ratio Statistics (LRS) show that the performance of the models estimated on the split datasets is significantly better than the performance of the model estimated on the complete dataset (See Appendix A Table 6). In other words, there is a significant difference between the relationship between perceived safety, liveliness, and beauty and the volumetric built environment elements in higher-density environments and in lower-density environments. Moreover, the elements that have been found to have a significant influence on perception vary per dataset.
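The comparison between the pooled model and the split models can be sketched as a likelihood ratio test. The log-likelihood values and the number of parameters below are invented placeholders, not the values from Appendix A. The sketch further assumes that both split models use the same set of parameters as the pooled model, so that the degrees of freedom equal the number of parameters of one model. The chi-square tail probability is computed with a simple series expansion, which loses precision for extremely small p-values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution.

    Uses the series expansion of the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def likelihood_ratio_statistic(ll_pooled, ll_splits):
    """LRS = -2 * (LL_pooled - sum of the split-model log-likelihoods)."""
    return -2.0 * (ll_pooled - sum(ll_splits))

# Placeholder values for illustration only.
ll_pooled = -4400.0
ll_high_density = -2150.0
ll_low_density = -2235.0
n_parameters = 6  # per model; the split models add this many extra parameters

lrs = likelihood_ratio_statistic(ll_pooled, [ll_high_density, ll_low_density])
p_value = chi2_sf(lrs, df=n_parameters)
print(lrs, p_value)
```

A small p-value indicates that estimating separate models per density class fits the choices significantly better than one pooled model.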

As this article focuses on densifying cities, we will look closer into the findings from the high-density dataset (readers interested in the comparison of the high- and low-density datasets are referred to Van Veghel (2022)). The significant attributes are the outputs of this first analysis step and will be the input for the following step of the CUD tool. The results of the estimations are presented in Table 3. In social sciences, the expected model fit is usually 0.2 or higher (Hensher and Greene, 2003). In this study, the relatively weak performance (expressed by McFadden’s rho square) of the MNL models for perceived beauty, liveliness, and safety suggests that the built environment elements explain only a limited part of human perception. This could indicate that considering only objective aspects such as built environment elements is not enough to understand human perception, and that more person-related characteristics (e.g., socio-economic background, personality, and mood) may need to be added to the estimation. However, the Place Pulse 2.0 dataset does not contain data on person-related characteristics.
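For reference, McFadden's rho square compares the log-likelihood of the estimated model with that of a null model. The sketch below assumes a null model in which all alternatives are equally likely, which is one common choice and may differ from the reference model used in the study. The numbers are invented for illustration.

```python
import math

def null_log_likelihood(n_choices, n_alternatives=2):
    """Log-likelihood when every alternative is equally likely to be chosen."""
    return n_choices * math.log(1.0 / n_alternatives)

def mcfadden_rho_squared(ll_model, ll_null):
    """rho^2 = 1 - LL(model) / LL(null); 0 means no improvement over the null."""
    return 1.0 - ll_model / ll_null

# Invented example: 1,000 binary choices and a fitted log-likelihood of -650.
ll_null_example = null_log_likelihood(1000)
rho2 = mcfadden_rho_squared(-650.0, ll_null_example)
print(round(rho2, 4))
```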

Table 3 MNL estimation results

Regarding perceived safety in the high-density dataset, sky share, building share, and area coefficient of variation have a negative influence on perceived safety, whereas height median and offset distance ratio have a positive influence. In terms of perceived liveliness, tree share, absolute height difference, and façade length index have a significant positive influence, while offset distance height ratio has a negative influence. This set of attributes indicates that an environment with greenery and with buildings of varying heights and lengths is perceived as livelier. Regarding perceived beauty, the positive impact of tree share, along with the negative impact of offset distance height ratio, aligns with previous research highlighting the significance of enclosure-related elements in enhancing perceived beauty (Joglekar et al., 2020; Karimi, 2012; Rossetti et al., 2019; Weber et al., 2008). Additionally, building height variation has a negative influence on perceived beauty, which can be explained by previous findings that highlight the positive impact of uniformity on perceived beauty (Karimi, 2012). Overall, the results for perceived beauty, liveliness, and safety reveal the significant influence of trees, building composition, building height, and street width on human perception.

3.2 Incorporation of human perception into computational urban design

For the integration of the relationships found between human perception and the volumetric built environment elements into the design process of a high-density neighborhood, a CUD tool has been set up. It enables the analysis and optimization of a volumetric urban design that also takes human perception into account. The CUD tool builds on a set of existing tools originally developed at TU Delft, here called TUD-CUD for simplicity (García González, 2019; Agugiaro et al., 2020; Doan, 2021). The core of the tool is developed using Grasshopper for Rhinoceros and allows static GIS data to be integrated via several interfaces. Existing buildings, roads, parcels, and vegetation in and surrounding the study area are imported from a semantic 3D city model and can be used together with 3D data generated parametrically within Grasshopper, such as new buildings, new streets, and new vegetation within the study area. The purpose of the tool is to support the designer in defining a number of alternative solutions, called scenarios, in which several volumetric designs are proposed that comply with existing planning regulations. Not only buildings but also streets are considered and adapted in the different output scenarios. The output of the TUD-CUD tool therefore consists of different combinations of two main urban object classes: building blocks and streets. The former can take the shape of either a solid block or a solid block with a courtyard in it, while the latter is composed of strips, each having a function such as a pedestrian path, green strip, or road meant for cars. An example is provided in Fig. 3, while Fig. 4 provides an alternative, cross-section view of a resulting street strip.

Fig. 3

Example of the Graphical User Interface of the TUD-CUD tool in Grasshopper/Rhinoceros. On the left, a scenario consisting of buildings (represented as solid blocks or solid blocks with courtyards) and street strips. On the right, the Grasshopper window with information panels and data nodes. [Image source: Agugiaro et al., 2020]

Fig. 4

Section of possible layout for outdoor spaces resulting from the TUD-CUD tool and visualised in Grasshopper/Rhinoceros. Note that the trees are actually not modeled in Rhinoceros/Grasshopper. They are added here only for visualization purposes. [Image source: Agugiaro et al., 2020]

The TUD-CUD output represents the starting point for the integration of human perception within a CUD tool in order to optimize a high-density urban area design scenario. First, a design scenario generated through TUD-CUD is imported into Grasshopper using the Urbano plugin (Dogan et al., 2020). For this part of the study, Grasshopper was chosen because of its well-known suitability for architecture-related procedural modeling. To allow for design optimization, the additional design variables in the Grasshopper model (listed in Table 4) have been set so that they are aligned as closely as possible with the volumetric built environment elements that were found to influence human perception in high-density areas (see Table 3). Additionally, to obtain realistic design scenarios, requirements representative of a high-density neighborhood in the Netherlands have been included in the process, such as the maximum building height and the minimum required amount of square meters.

Table 4 Design variables to incorporate human perception in the CUD tool

The final set of design variables for a high-density neighborhood that is defined in Grasshopper can be subdivided into three main categories: i) building height, ii) building footprint, and iii) street perspective & typology. Every design variable can be adjusted either per building, for all buildings, for all streets, or for all plots. The plots eventually define the footprint type (courtyard or solid) and size of a building. Table 4 presents the different design variables that were included in the CUD tool to incorporate human perception. There is no explicit design variable for the placement of trees. Instead, the design variable ‘vegetation strip outer width’ automatically defines the number of trees that are generated in the design: a street with one green strip contains fewer trees than a street with two green strips and, intuitively, the wider the strip is, the more trees per strip are added to the proposed design.

3.3 Evaluation of the human perception in the virtual urban scenarios

3.3.1 Taking virtual images

The first step is to evaluate and quantify the effect of a design scenario on human perception. For this, a similar approach is followed as described in Sect. 3.1 of this article. However, instead of analyzing “real” street view images as before, now the images are generated from the virtual urban scenario. In other words, sets of 3D isovists are generated automatically at selected positions in the streets of the virtual urban scenario, representing a set of views that a hypothetical person would experience when walking in that specific high-density urban area setting. For this reason, several points of analysis are generated in the design scenario. These points of analysis are comparable with the image locations mentioned before. Different sampling densities can be set for the points of analysis: very dense points will result in a more thorough and descriptive analysis, however at the cost of a longer processing time. An example is given in Fig. 5.

Fig. 5

[Left] Schematic representation of isovists created at different locations along the street network. [Right] Schematic representation of the field of view used to compute the isovist

At each point of analysis, 3D isovists are generated from which virtual images are extracted. Furthermore, as in the data-gathering process used for the analysis of the relation between the volumetric built environment and human perception (Sect. 3.1), a buffer is generated around the point of analysis and around the street segment to which that point belongs in order to retrieve the additional volumetric built environment data.
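The effect of the sampling density on the number of points of analysis can be illustrated with a minimal Python sketch. It is not part of the Grasshopper-based tool: the function name `sample_points`, the 2D polyline representation of a street, and the spacing values are assumptions made purely for illustration.

```python
import math

def sample_points(polyline, spacing):
    """Place points every `spacing` units along a 2D polyline (hypothetical helper)."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [polyline[0]]       # always include the start of the street
    next_dist = spacing          # distance along the line of the next sample
    walked = 0.0                 # length of the segments already processed
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        # Emit every sample whose distance falls on the current segment.
        while seg_len > 0 and next_dist <= walked + seg_len + 1e-9:
            t = (next_dist - walked) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_dist += spacing
        walked += seg_len
    return points

# An assumed L-shaped street of 150 m, sampled coarsely and densely.
street = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
coarse = sample_points(street, 50.0)
dense = sample_points(street, 10.0)
print(len(coarse), len(dense))
```

A smaller spacing yields more points of analysis, and therefore more isovists to compute, which is the trade-off between descriptive detail and processing time mentioned above.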

3.3.2 Scoring of virtual images

In Sect. 3.1 of this article, the relationships between human perception and the volumetric built environment in high-density urban areas were analyzed, resulting in a set of estimates (see Table 3) associated with the relevant features. These estimates have been inserted in the Grasshopper-based CUD tool developed for this study and, together with the quantitative analysis from the isovists, they are used to compute the score of the human perception categories (i.e. perceived beauty, liveliness and safety) at every point of analysis. Point-wise results are then aggregated at street element level, and up to the whole urban scenario level. The results can therefore also be visualized at different levels of spatial granularity: from the single point to the whole urban scene. The interactive visualization of the results helps the user, for example, to quickly identify weak and strong street segments. A schematic example is provided in Fig. 6. This drawing exemplifies the concept of isovist scores (represented here by points) aggregated at street level (represented here by means of lines). Growing values associated with the isovists and streets are represented here in simplified form by means of a red-yellow-green color scale, in which red represents low values and green high values.
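The scoring and aggregation steps described above can be sketched in Python as follows. This is a simplified illustration, not the Grasshopper implementation: the coefficients are hypothetical placeholders and not the estimates of Table 3, the feature names are chosen for readability, and the use of the arithmetic mean for aggregation is an assumption, since the aggregation function is not specified here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical coefficients, NOT the estimates reported in Table 3.
COEFFS = {
    "beauty": {"tree_share": 2.0, "height_cv": -1.0},
    "safety": {"tree_share": 1.0, "sky_share": -0.5},
}

def point_score(features, coeffs):
    """Linear score: sum of coefficient times feature value at one point."""
    return sum(c * features.get(name, 0.0) for name, c in coeffs.items())

def aggregate(points, category):
    """Aggregate point scores per street, then over the scenario (mean assumed)."""
    per_street = defaultdict(list)
    for p in points:
        per_street[p["street"]].append(point_score(p["features"], COEFFS[category]))
    street_scores = {s: mean(v) for s, v in per_street.items()}
    scenario_score = mean(list(street_scores.values()))
    return street_scores, scenario_score

# Three assumed points of analysis on two streets.
points = [
    {"street": "A", "features": {"tree_share": 0.5, "height_cv": 0.25}},
    {"street": "A", "features": {"tree_share": 0.25, "height_cv": 0.25}},
    {"street": "B", "features": {"tree_share": 0.0, "height_cv": 0.5}},
]
street_scores, scenario_score = aggregate(points, "beauty")
print(street_scores, scenario_score)
```

In this toy example street B scores lower than street A, which corresponds to the kind of weak segment that the color-coded visualization is meant to reveal.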

Fig. 6

Schematic example of isovists scores (represented here by points) aggregated at street level (represented here by means of lines). Growing values associated with the isovists and streets are represented here in simplified form by means of a red-yellow-green color scale

3.3.3 Optimization

The procedure explained in the previous section describes how a single urban design scenario can be defined in order to quantify the influence of a high-density urban area design scenario on human perception. The CUD tool actually takes advantage of the intrinsic capabilities of the Grasshopper environment to dynamically change (interactively by the user, or automatically) the urban design scenario based on the values of design variables of a high-density urban area presented in Table 4. In addition, for each new configuration, the scores of perceived beauty, liveliness and safety can be computed automatically.

The next step is therefore to check whether and how a high-density urban area design scenario can be optimized based on human perception. The optimization can be carried out in different ways: first individually on each of the three investigated categories of human perception (i.e. by means of a single-objective optimization), and then on all of them together via a multi-objective optimization. In Grasshopper, the single-objective optimization is carried out using the Galapagos plugin (Rutten, 2013). For the multi-objective optimization, the Octopus plugin (Vierlinger et al., 2018) is used. The multi-objective optimization follows the principle of Pareto efficiency and provides design outputs that cannot be further improved on one human perception category score without decreasing the score of another category.
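The Pareto-efficiency criterion can be made concrete with a short Python sketch. It only illustrates the non-dominance test on a handful of made-up score triples (beauty, liveliness, safety) and does not reproduce the search procedure of the Octopus plugin; the function names and score values are assumptions.

```python
def dominates(a, b):
    """True if a is at least as good as b on every score and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Keep only the designs that no other design dominates (all scores maximized)."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]

# Made-up (beauty, liveliness, safety) scores of four candidate designs.
candidates = [
    (0.6, 0.2, 0.4),
    (0.5, 0.5, 0.5),
    (0.4, 0.1, 0.3),  # worse than the first design on every score
    (0.5, 0.5, 0.4),  # worse than the second design on safety only
]
front = pareto_front(candidates)
print(front)
```

The two designs that remain cannot be ranked against each other without a preference between the categories, which is why the multi-objective optimization returns a set of designs rather than a single optimum.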

During the optimization process, the design requirements of a high-density urban area are of importance. Since the included relations that score the design on human perception are linear, optimized designs that do not have to meet any requirements can take extreme (and unrealistic) forms; for example, the buildings could become extremely high. Additionally, the requirements incorporate other design objectives, such as a minimum amount of to-be-realized square meters of floor space.

The design variables are therefore influenced by the design constraints. If a design variable is set to a level that causes a design constraint not to be met, then the design variables are adjusted accordingly. For example, a too-high value for the building height will lead to an excessive and unacceptable amount of floor space. Therefore, if a certain building is set to a height that results in too many square meters of floor space, the height of the other buildings is lowered in order to respect the constraint on the maximum floor area. Van Veghel (2022) extensively describes how the design constraints are implemented during the optimization process and how they interact with the design variables.
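The building-height example can be sketched in Python as follows. The actual constraint handling is documented in Van Veghel (2022); the sketch below is only one possible interpretation, in which the other buildings are lowered by a common factor. The storey height of 3 m, the function names, and the footprint and height values are all assumptions.

```python
FLOOR_HEIGHT = 3.0  # metres per storey, an assumed value

def floor_area(footprints, heights):
    """Total floor area: footprint times number of storeys, summed over buildings."""
    return sum(f * (h / FLOOR_HEIGHT) for f, h in zip(footprints, heights))

def repair_heights(footprints, heights, fixed, max_area):
    """Lower every building except `fixed` by a common factor until max_area is met."""
    total = floor_area(footprints, heights)
    if total <= max_area:
        return list(heights)  # constraint already satisfied, nothing to adjust
    fixed_area = footprints[fixed] * heights[fixed] / FLOOR_HEIGHT
    others_area = total - fixed_area
    budget = max_area - fixed_area  # floor area left for the other buildings
    if budget < 0 or others_area == 0:
        raise ValueError("constraint cannot be met by lowering the other buildings")
    factor = budget / others_area
    return [h if i == fixed else h * factor for i, h in enumerate(heights)]

# Building 0 is set to 60 m; the maximum floor area is 3000 square meters (assumed).
footprints = [100.0, 100.0, 100.0]
heights = [60.0, 30.0, 30.0]
repaired = repair_heights(footprints, heights, fixed=0, max_area=3000.0)
print(repaired, floor_area(footprints, repaired))
```

A proportional reduction is chosen here only because it is simple to state; other repair rules, such as lowering the tallest remaining building first, would satisfy the same constraint.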

4 Implementation based on a test scenario

The methodology presented in the previous section has been implemented and tested on a test scenario. The test scenario focuses on high-density urban area development, as high-density urban environments are required to accommodate the increasing demand for urban living. For this reason, the use of a CUD tool that generates design scenarios can contribute to an understanding of the effects of different design choices on how these environments will be perceived by people. This test scenario consists of four plots and the adjacent streets, as well as the surrounding urban “context”. Figure 7 visualizes the implementation steps in the CUD tool.

Fig. 7

Steps taken in the CUD tool

The resulting design in Grasshopper consists of core buildings and streets that can be modified, as well as fixed surrounding buildings that are used to define the urban context. The urban context, i.e. the shape and position of the surrounding buildings, cannot be modified by the Grasshopper-based CUD tool. In Fig. 8, the buildings that will be subject to optimization are depicted in light azure, while the streets surrounding them can be easily recognized. The buildings representing the urban “context” are depicted in gray.

Fig. 8

3D visualization of the test scenario used as a starting point to investigate the effect of human perception in the design process

The scores on perceived beauty, liveliness and safety are first calculated when the design variables have been initialized, i.e. have been assigned values computed from the scenario used as a starting point, with weights based on the estimates from Table 3. Subsequently, the scores are recomputed by applying optimization algorithms to the predefined test scenario. In the following sub-sections, the main findings are presented and explained.

4.1 Single-objective optimization results

By means of a single-objective optimization, the high-density urban area design scenario shown in Fig. 8 has been optimized on perceived safety, liveliness and beauty, respectively. The following figures show the result of each optimization process and are provided with a short description of the findings.

In terms of perceived safety, the single-objective optimization (shown in Fig. 9) results in large building blocks along the streets. The buildings are positioned at some distance from the streets, and the green strips in the street layout are relatively wide, resulting in many trees. Compared with the reference design in Fig. 8, the core buildings (in light azure) and the streets have been modified. This result can be explained by the design degrees of freedom (the design variables along with the design constraints) and the incorporated relationships that define the human perception score. The many trees resulting from the green strip width reduce the sky and building view as much as possible. The solid building mass further reduces the sky view, as building view has a less negative influence than sky view on perceived safety. Furthermore, the building offset from the street results in relatively high buildings because of the required floor space area, while it additionally decreases the offset distance height ratio variable. Overall, this is in line with some of the characteristics investigated before regarding perceived safety in high-density urban areas.

Fig. 9

Result of the single-objective optimization in terms of perceived safety

When it comes to perceived liveliness, the optimization results (as in Fig. 10) show a greater variation in building heights and in the distance between the buildings, as well as buildings that are positioned close to the streets. In particular, the streets are narrowed by removing the canal, while several green strips are still included to maintain a large share of trees in the street view. As a result, the absolute height difference between the tallest and the shortest building is large, and the offset distance height ratio is relatively large due to the narrow streets. Tree share, absolute height differences and the façade length index positively affect the perceived liveliness, while the offset distance height ratio does not. This result also reflects the earlier findings regarding perceived liveliness in high-density urban areas.

Fig. 10

Result of the single-objective optimization in terms of perceived liveliness

When optimizing in terms of perceived beauty, Fig. 11 shows that, compared with the reference design, one building in particular has been removed to meet the design constraints. Furthermore, the street offset becomes relatively large, as does the mean building height of the remaining buildings. The height difference between the buildings is relatively small and the green strips ensure a large number of trees in the street view. This corresponds with some of the characteristics investigated before regarding perceived beauty, i.e. tree share positively affects the perceived beauty, while the height coefficient of variation has a negative effect on the perceived beauty in high-density urban areas.

Fig. 11

Result of the single-objective optimization in terms of perceived beauty

4.2 Multi-objective optimization results

Finally, the same reference design is optimized using multi-objective optimization. The multi-objective optimization tries to improve the initial high-density urban area design scenario on multiple human perception categories at once. In this Grasshopper-based CUD tool, a Pareto front, consisting of many different design scenarios, is generated as an outcome of the multi-objective optimization. Any design on the Pareto front cannot be further improved on one human perception category score without decreasing the score of another category. Figure 12 presents a design scenario resulting from the multi-objective optimization.

Fig. 12

Pareto-front design scenario resulting from the multi-objective optimization

From the optimization results obtained with the test case, it can be concluded that the optimization methodology involves a high level of interaction between the design freedom, understood as the set of design variables and requirements that allow for the generation of alternative design scenarios, and the incorporated relations between the volumetric built environment and the different human perception categories in high-density urban areas. As a result, the output can be explained by the incorporated relationships, but only when the design freedom is taken into account. A more flexible design freedom is likely to allow for more optimized designs. As expected, the most dominant design variable (volumetric built environment element) that positively influences the scores on all three human perception categories is the green strip width, because a wider green strip results in more trees. This is in accordance with the findings of the first part of the article, in which we found that greenery plays an important and positive role when it comes to human perception in high-density urban areas.

In summary, Table 5 provides an overview of the single-objective and multi-objective optimization results, including the values of the most relevant design variables that define the resulting output designs.

Table 5 Results of single- and multi-objective optimization per most relevant design variables

5 Discussions and conclusion

Densification and the creation of high-density urban areas have been proposed as solutions to the increasing world population and diminishing earth resources. However, designing high-density urban areas comes with risks, such as the possible negative impacts of high-density areas on human subjective well-being. Reducing such negative impacts requires involving potential end-users together with the other stakeholders (e.g., area developers, local government) in the early stages of the design process, which is a challenge. Several Computational Urban Design (CUD) approaches and tools already exist today that support urban designers in generating several urban design alternatives (which in this article we call “scenarios”) from the very early stages. The reasons for the different scenarios are manifold. Each scenario may focus on or give particular relevance to certain aspects that are deemed important in certain circumstances, or may depend on whether, and to what extent, the designer decides to respect certain constraints, for example in terms of urban regulations, environmental characteristics, social aspects, or economic goals. An advantage of creating several scenarios is that it facilitates the subsequent evaluation of the alternatives by the stakeholders and the provision of feedback that may lead to a further refinement of the design. However, depending on the complexity of the project and the number of aspects to be considered (and thus formalized as design variables), the set of all possible design alternatives could amount to an overwhelming number of scenarios, de facto nullifying the original purpose of facilitating the understanding and feedback phases by the stakeholders. Therefore, optimization algorithms are often used to reduce the resulting number of scenarios by optimizing either a single criterion or multiple criteria at the same time.

Such CUD tools are essential for navigating the complexities of the urban design process, especially considering the diverse needs of various stakeholders (including the end-users) while focusing on the enhancement of urban areas in terms of their liveability. However, involving end-users in the design generation and evaluation phases is not always easy, as it requires time, effort, and an inclusive approach. In that sense, big-data approaches that capture human perception (as in this paper) can support the consideration of end-users’ needs by generalizing and quantifying them.

This article has explored the incorporation of human perception in CUD from the early stages of a high-density urban area design process, with the ultimate goal of generating design scenarios that can enhance the subjective well-being of end-users of a high-density urban area. In order to do so, the research work has been subdivided into two main steps. First, human perception aspects (such as perceived beauty, liveliness, and safety) in high-density urban areas, as the indicators of subjective well-being, have been investigated and quantified, leading to the identification of the most relevant urban characteristics (volumetric built environment elements) for creating beautiful, lively, and safe high-density urban areas. In the following step, the results and findings from the first part have been implemented in a CUD tool and finally tested on a hypothetical high-density urban area setting.

The first part of the research has covered a literature review on the relationship between human perception and the built environment, focusing on perceived beauty, liveliness, and safety in high-density urban areas. Following a big-data approach, a multinomial logit (MNL) analysis was applied to respondents’ choices among street view images regarding their perception of safety, beauty and liveliness, combined with open built-environment data, in order to quantify the relationships between volumetric built-environment elements and human perception in high-density urban areas. The analysis results reveal the significant influence of trees, building composition, building height, and street width on human perception.

For the second part of the study, a Grasshopper-based CUD tool has been developed, based on a generative design component built on top of a parametric urban-design component and resulting in an overall CUD tool. This tool allows an urban designer to have optimized high-density urban area designs based on human perception by incorporating the relationships between volumetric built environment elements and human perception found in the first part of the study. The incorporation of clear functions describing the relation between human perception and the volumetric built environment characteristics in high-density urban areas is therefore essential for computational urban design.

The main contribution of this study is therefore to highlight the importance of addressing human subjective well-being in high-density urban area design through the incorporation of human perception, ultimately extending the possibilities of computational urban design to support the urban design and development process and including these aspects from the beginning in the initial discussion/evaluation process between designers, developers and potential end-users of an area. To the authors’ knowledge, no similar examples of CUD tools could be found in the existing literature, especially ones that follow an empirical approach to incorporating human perception and thereby provide evidence-based design solutions.

In particular, when it comes to the first part of the study, the research outcomes underline the importance of considering the interplay between volumetric built environment elements and human perception in high-density urban areas. Since we aimed to understand the relationship between the dependent and independent variables, the MNL model has proven suitable to highlight dominant attributes and to facilitate their incorporation into computational urban design. However, the model fits were low, especially for perceived safety and liveliness. These results indicate that, when the relationships between the volumetric built environment elements and human perception were considered without any control variable (i.e., personal characteristics), this relationship was relatively weak. Therefore, volumetric built environment data (coming from street view imagery and open spatial datasets) should be complemented with additional datasets obtained through surveys or alternative methods, such as virtual environments. This holistic approach would be able to reveal more robust relationships. Additionally, results suggest studying the relationship between human perception and the built environment for specific urban typologies and gathering socio-demographic information from respondents to optimize designs for target groups. The inclusion of more detailed volumetric greenery elements, such as tree height and width, is also recommended, along with additional attributes describing variations in building shapes and a more detailed description of street typology.

When it comes to the second part of the work, the developed CUD tool, results and experiences collected so far have provided several recommendations for further improvement when incorporating human perception into computational urban design of high-density areas. For example, some of the current limitations should be overcome, e.g. by allowing a building to have a different offset from the street depending on its height, or by allowing the footprint of each building to be set independently. Secondly, there is still room for improvement in the pace of the design generation and analysis process. The current optimization tool generally requires several hours to find an optimal design and, since a fast optimization process is preferred in relation to its potential use in practice, the time of one design generation and analysis run should be minimized in order to lower the cost, expressed in time, of increasing the design freedom. However, this might itself lead to some disadvantages, as the amount of design freedom is directly related to the computation time and the number of resulting scenarios to be explored.

Finally, looking further into future possible improvements, an ideal CUD tool should also incorporate the perception of a specific target group for a specific area or urban environment. Additionally, in order to increase its supportive function to incorporate human perception from the preliminary design process, the descriptive power of the incorporated relationships could be increased. For example, in addition to using a big-data approach and the MNL analysis, other methods could also be added, such as machine learning and investigating perceived urban environments under experimental conditions. Furthermore, presenting generated urban designs in (immersive) VR settings can allow for testing how different groups of people would experience them (safe, lively, beautiful). This can also be a step forward in the human representation within urban digital twins.