1 Introduction

Geographic space in a city refers to the public space that people can freely access. Space syntax is one of the most commonly used approaches for exploring spatial morphology at the scales of buildings, communities, and urban settings. Hillier and his colleagues (Hillier & Hanson, 1984; Hillier et al., 1976) conjecture that space is the logic of human activities. They proposed the space syntax method to decompose continuous geographic space into a set of unique axial lines, from which a spatial network can be constructed to analyze the spatial morphology and its correlations with human mobility. Space syntax has been widely applied to explore urban morphology and urban design in different fields. The success of space syntax and network science in urban studies provides fundamental theories and techniques for exploring urban morphology in 2D space. However, geographic space is 3D in nature, whereas most related studies are confined to 2D space, partly because of limited access to human movement data in 3D space and a lack of methodologies.

Modeling urban morphology in 3D space can shed new light on this field, particularly in a dense "vertical" urban environment like Hong Kong. Nevertheless, two fundamental questions arise when applying space syntax in 3D space: Is the core method (i.e., the axial line) applicable in 3D space, and how does the 3D spatial morphology fit into existing geographic and complex network theories in 2D space, or vice versa? Fortunately, advances in information and communication technology provide opportunities to monitor and sense human movement in 3D space at unprecedented spatiotemporal resolutions. These advances bring both challenges and opportunities for studying the relation between spatial morphology and human mobility. How can the 3D spatial morphology be represented in a densely built, complex environment? And which features of such morphology relate to human mobility?

Motivated by the above knowledge gaps and the literature on multilayer networks in network science (Kivelä et al., 2014), the aim of the paper is to predict the route-based pedestrian flow of a multilevel campus in Hong Kong as a function of spatial configuration using space syntax analytics in a 3D context. In principle, the aim of estimation/prediction is to use the best predictors to achieve the best estimation/prediction results. This paper extends the traditional space syntax axial model by constructing a multilayered axial network of a campus to explore multilevel spatial morphology. To estimate the route-based pedestrian flow in the multilayered spatial network, over 2,200 Wi-Fi access points (APs) with georeferenced locations in a multilevel building are used to infer user locations and routes between origin and destination. When a user connects to or disconnects from a georeferenced AP, the user's location and stay time can be estimated. When a user moves around and connects to different APs, the trajectory among consecutive APs can be inferred in the multilayered spatial network. In this way, a finer-grained chain of human movements can be obtained from the AP connection data, and the flow of people on each axial line can be estimated. Multiple variable regression analysis is then applied to assess the correlation between human flow and the multilayered spatial network attributes. To further explore spatial morphology, community detection is conducted on the multilayered spatial network. The empirical case study provides evidence that the extension of space syntax can work well in modeling 3D morphology using a multilayered spatial network representation, based on which the regression model better predicts human mobility.

2 Related works

In the past decades, human–environment interaction has received considerable attention from a variety of disciplines, and it is agreed that the built environment and human activities are associated with each other. Some early studies focused on identifying the static layout of the urban environment. For example, Weber (1958) proposed three classic models of urban structures (i.e., concentric zone model, sector model and multiple nuclei model). With the advances in urban studies, Hägerstrand (1970) suggests that people are also of "scientific relevance" to regional and urban science and proposes the framework of time geography to facilitate research on human–environment interactions. To further advance this topic, this section reviews two main bodies of literature: the first presents the effectiveness of the space syntax methodology in representing space and the relationships between spatial entities, and the second introduces how the space syntax representation and its features can help in understanding human–environment interaction using empirical human activity data.

2.1 Space syntax representation

Hillier, Hanson and their colleagues (Hillier & Hanson, 1984; Hillier et al., 1976) at The Bartlett, University College London propose the concept of "space syntax" as a set of theories and techniques to help urban planners simulate the likely social effects of their designs, for instance to understand the correlation between human movement flow and the spatial structure of the urban environment. Methodologically, space syntax breaks down the large and continuous urban space into a set of unique axial lines representing each convex space. Based on the axial line representation, a spatial network can be constructed to analyze the spatial structure and its relationship with human movement flow, borrowing methods from network science.

In the past decades, space syntax has attracted significant attention from researchers in different fields (please refer to Supplementary Material sections 1 and 2 for more details). Here we mainly highlight the research gaps in studying human–environment interaction using space syntax. First, traditional space syntax research puts much emphasis on the spatial and geometric structure when representing space (Ratti, 2004); for example, axial lines are based on visibility, so human movements might be ignored. The visibility-based rule is more useful for open space; however, it does not match the scenarios in complex urban environments where human movements are influenced by many factors, such as densely built cities or buildings on a university campus. Although Liu and Jiang proposed a method for redefining and generating axial lines by considering the walkability and drivability of streets (Liu & Jiang, 2012), the accessibility-based space syntax model has rarely been examined using empirical human activity data. Second, the refined theoretical basis was first proposed in the context of 2D urban space and has not considered the scenario in 3D space. Specifically, representing the 'walkable' space of a 3D indoor context in a unified model is challenging and lacks empirical validation. This paper tries to address these two gaps by studying the accessibility-based space syntax model in a real scenario and adapting it to the 3D context of a university campus, where the 'walkable space' depends heavily on routes, staircases, escalators, and elevators. The proposed model, which considers the above indoor elements and is validated with detailed human activity data, is presented in the following sections.

2.2 Space syntax analysis of pedestrian movement

There are a number of empirical studies using space syntax analysis to study aggregate pedestrian movement. For example, Hillier found in his early studies that pedestrian flows can be predicted using the axial representation by calculating integration, known as closeness centrality in the network science literature (Hillier, 1996; Hillier et al., 1993). Following this line of work, Hillier and Iida (2005) found that segment angular analysis performed better than metric and topological costs when correlating with movement in three urban areas in London. Chang and Penn (1998) found that conventional space syntax analysis predicts passenger flows poorly. They therefore developed a model called "IMCM" that considers additional parameters besides the conventional space syntax parameters, and they found that their model is able to predict the passenger flow in the Barbican and the South Bank complexes in London. Zhang et al. (2012) analyzed the correlation between pedestrian flow and space configuration in a complex commercial building, including multilevel entrances and common multilevel areas. Furthermore, Greenberg et al. (2020) proposed a new model that generates a 3D segment map representing the topographic environment while considering visibility and physical effort.

To explore space syntax in 3D space, some scholars have tried to extend the axial model of space syntax into 3D space from the perspectives of geometry, topology and analytics (Asami et al., 2003; Ascensão et al., 2019; D'autilia & Spada, 2018; Ji & Liu, 2017; Kim et al., 2019; Mackaness et al., 2007; Wang et al., 2007; Zhang & Chiaradia, 2019; Zheng et al., 2010). For example, Zhang and Chiaradia constructed a 3D spatial network model of Central in Hong Kong, considering the 3D geometry of public and private space when analyzing pedestrian movement (Zhang & Chiaradia, 2019). Specifically, they studied correlations between an axial map, an axial segment map, a path-center line outdoor map and a path-center line indoor-outdoor map. They found that 3D angular analysis with the path-center line indoor-outdoor map performs better than the axial topological map and the segment Euclidean and angular map. That research used a simple linear regression model to validate the association, but it did not consider other centrality measures from network science.

A key objective of this research is to estimate route-based pedestrian flow on a university campus. As such, research papers examining pedestrian behavior in university campus environments were reviewed. For example, Sun et al. conducted a comparative analysis to identify the difference between the perception and reality of walking using the walking accessibility method, which combines gravity-based accessibility and cumulative opportunities accessibility measures (Sun et al., 2015). In a longitudinal study (Sun et al., 2014), the authors found that changes in the university's built environment had a significant impact on walking behavior. However, few studies have used Wi-Fi log data to profile detailed human flows across the 3D space of a campus, where the spatial relations are more complex than in 2D space.

Although multilayer networks are well studied in network science (Kivelä et al., 2014), their application in the built environment is still in its infancy. One reason could be the limited access to movement data in complex buildings and limited computing capacity. Our research differs from previous research in two fundamental ways: a) the use of Wi-Fi data to capture pedestrian flow automatically and b) the use of different complex network measures. In summary, the novelty of our research is the combined use of space syntax analysis, multilayer network science and Wi-Fi data in a campus environment. However, we acknowledge that the current study does not apply 3D angular analysis, which we hope to incorporate in the future to characterize the morphology of the campus.

3 Methodology

This section introduces the method of space syntax (please refer to Supplementary Material section 2 for more details) and illustrates the extension of space syntax to model spatial morphology in 3D space. First, the 3D spatial morphology is represented using a multilayered axial network in space syntax. Second, the 3D spatial morphology is characterized by applying several space syntax metrics and network centrality measures to the constructed multilayered spatial network. Third, the route-based pedestrian flow in 3D space is estimated using Wi-Fi log data. Fourth, a multiple variable linear regression approach is applied to analyze the correlations between the pedestrian flow and the multilayered spatial structure. Last, a community detection algorithm is applied to identify the 3D functional areas. Metric calculation and visualization were performed with internally developed open-source Python code and the ArcGIS and ArcScene packages. The statistical analysis was performed using the SPSS software.

3.1 Modeling spatial morphology using space syntax

A campus includes both indoor 3D space and outdoor 2D space. The latter can be regarded as normal 2D space for the generation of axial lines. The former, the space in 3D buildings, can be regarded as 2D planar floors plus the connection space shaped by the vertical physical infrastructure (e.g., elevators, stairs, escalators) (Fig. 1-(b)). All axial lines of floors and connection space are drawn manually in ArcGIS and ArcScene. The essential rule used for creating axial lines in the 3D campus is based on "accessibility" instead of "visibility". Under this rule, axial lines are generated from the physical connections between areas in 3D space, regardless of which areas are visible from one another. This addresses a drawback of the classic space syntax rule, which depends heavily on geometric structure and visibility and therefore ignores environment-constrained human activity. The complex constraints, particularly in the 3D campus context, are the transitions within floors and between floors.

Fig. 1 Modeling "connection space" to extend space syntax in a multilayered space: (a) example of connection space, where A2 and B1 are the 'U-turn' staircases across floors; (b) 3D axial map model. Connection information of all axial lines is validated manually and imported into our C# program to generate the graph model

Based on existing AutoCAD campus map data, researchers and student helpers walked through all the routes on campus to validate the connection matrix within and between floors. For example, the connection between two floors by elevator is a perfectly vertical line, while an escalator produces a line with a 3D angle, and a staircase with a U-turn results in two angled axial lines. Using the same validation method, the connections of axial lines are checked against reality in the 3D campus environment. The connected axial lines of both 2D and 3D spaces are then imported into a programming environment to generate the graph and calculate network metrics. The tool for constructing graph models from the 3D axial lines was developed internally in C#. The space syntax metrics are calculated with the depthmapX software.

The overall process is illustrated by examples in Fig. 1. For example, the elevators in Fig. 1-(a) are represented as one straight axial line linking different floors. Notably, some floors are not accessible from some elevators, in which case no links are created. The stairs and escalators in Fig. 1-(b) are represented as axial lines based on their directions and accessibility. Due to the small size of Hong Kong and the small area of The Hong Kong Polytechnic University (PolyU) campus (up to 9.46 hectares), PolyU has a limited number of large, fully walkable open spaces, around six in total, either extending longitudinally in one direction (i.e., rectangular space) or spanning two directions (i.e., squared space or atrium). In addition, the PolyU campus has covered walkways, and most of these large, fully walkable open spaces are rectangular in shape, as their length is more than twice their width (i.e., they resemble a street or long corridor). For example, the largest fully walkable open space has a maximum width of 20 m and a length that is five times its width. In rectangular spaces, the major axis corresponds to the longest length, and the minor axial lines arise at the transverse corridor and at the entrances to the side buildings (see Figure S7). There are two large walkable squared spaces, one of which has a roundabout in the middle. As stated in Supplementary Material Section 2, the axial lines are the least number of the longest lines in space. Therefore, for the squared space, the axial lines correspond to the paths between the cores of the PolyU building, which are the longest lines (see Figure S7). Figure S7 shows the axial map of the podium level.

3.2 Characterizing spatial network structure

After generating the axial lines in 3D space, a 3D dual network can be obtained as an undirected dual graph G = (N, E), where the axial lines form the set of nodes \(n\in N\) and the intersections or connections between them form the set of edges \(e\in E\). A series of structural metrics are introduced to characterize the network in this section. In the theory of space syntax, the frequently used indicators for axial lines include connectivity, control, local and mean depth, and local and global integration (Jiang & Claramunt, 2002). These centrality parameters reflect the importance of nodes in a network and have been widely adopted in many studies (Rui & Ban, 2014; Sarkar et al., 2019; Senousi et al., 2020; Wang et al., 2011; Zhao et al., 2017) (please see Supplementary Material section 3 for more details). In addition, four other complex network centrality measures are used to characterize the 3D spatial dual graph, i.e., betweenness, closeness, eigenvector, and PageRank. Betweenness centrality is commonly referred to as choice in angular segment analysis but is not often used in axial line analysis. For the convenience of readers, some of the concepts and formulas are introduced below.
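The dual-graph construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the line identifiers and the `build_dual_graph` helper are hypothetical, and the validated connections are assumed to be available as pairs of identifiers. It is not the authors' C# tool.

```python
from collections import defaultdict

def build_dual_graph(connections):
    """Return an undirected adjacency map: axial line -> set of connected lines."""
    graph = defaultdict(set)
    for a, b in connections:
        if a == b:
            continue  # ignore self-connections
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Hypothetical validated connections: two corridors on different floors
# joined by an elevator line, plus a lobby line on the first floor.
connections = [
    ("F1_corridor", "elevator"),
    ("elevator", "F2_corridor"),
    ("F1_corridor", "F1_lobby"),
]
graph = build_dual_graph(connections)

# Connectivity (degree) of each axial line in the dual graph.
degree = {line: len(neighbours) for line, neighbours in graph.items()}
```

Each axial line becomes a node regardless of whether it lies on a floor or in the connection space, so intra-floor and inter-floor links are treated uniformly.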

The degree of an axial line is the number of lines that directly intersect it. The control value literally refers to how an axial line "controls" its immediate neighbors, and is calculated as the sum of the inverse degree values of its immediate neighbor axial lines. The depth of an axial line is calculated from the number of lines within a given number of steps of the current axial line. Integration reflects how well a node is integrated with other nodes (both locally and globally): the higher the value, the more integrated the node. The radius used to calculate space syntax metrics is set to 2.

$$\sum\nolimits_{k=1}^{s}n_k=\left\{\begin{array}{ll}degree:&s=1\\local\;depth:&1<s<d\\total\;depth:&s=d\end{array}\right.$$
(1)

where s is the number of steps from the current axial line, \(n_k\) is the number of lines reached at step k, and d is the network diameter (i.e., the longest topological shortest path between any pair of nodes).
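The cumulative count in Eq. (1) can be illustrated with a breadth-first search. The sketch below is a minimal example on a hypothetical chain of five lines; the function names are assumptions introduced for illustration.

```python
from collections import deque

def step_counts(graph, source):
    """Return {k: n_k}, the number of lines first reached at step k from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = {}
    for node, d in dist.items():
        if d > 0:
            counts[d] = counts.get(d, 0) + 1
    return counts

def cumulative_count(graph, source, s):
    """Sum n_k for k = 1..s, following the left-hand side of Eq. (1)."""
    counts = step_counts(graph, source)
    return sum(n for k, n in counts.items() if k <= s)

# Hypothetical chain of five axial lines: a - b - c - d - e
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}

degree_c = cumulative_count(graph, "c", 1)   # s = 1
radius2_c = cumulative_count(graph, "c", 2)  # s = 2, the radius used in this study
```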

$$Control=\sum\nolimits_{k=1}^{m}\frac{1}{n_k}$$
(2)

where m is the number of immediate neighbors of the current axial line, and \(n_k\) is the degree (i.e., the number of neighbors) of its k-th neighbor.
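As a small worked example of Eq. (2), the sketch below computes the control value on a hypothetical star-shaped graph (one hub line with three dead-end lines). The graph and the `control` helper are illustrative assumptions.

```python
def control(graph, line):
    """Sum of the inverse degrees of the immediate neighbours of `line`."""
    return sum(1.0 / len(graph[neighbour]) for neighbour in graph[line])

# Hypothetical star: hub "h" intersects three dead-end lines x, y, z.
graph = {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}

hub_control = control(graph, "h")   # each neighbour has degree 1, so 1 + 1 + 1
leaf_control = control(graph, "x")  # the only neighbour has degree 3, so 1/3
```

The hub fully controls its neighbors because it is their only connection, whereas each dead-end line shares the hub with two others.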

$${MD}_i=\frac{{Total\;depth}_i}{n-1}$$
(3)
$$Local\;{Integration}_i=\frac{1}{RRA_i},\quad RRA_i=\frac{RA_i}{D_n},\quad RA_i=\frac{2\left({MD}_i-1\right)}{n-2}$$
(4)

where \({MD}_{i}\) is the mean depth of node i, n is the number of axial lines (nodes) in the graph, \(R{A}_{i}\) is the relative asymmetry, and dividing \(R{A}_{i}\) by \({D}_{n}\) gives the Real Relative Asymmetry (\(RR{A}_{i}\)); \({D}_{n}\) is the diamond value of a graph with n axial lines.
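The chain from total depth to integration in Eqs. (3) and (4) can be sketched as follows. The diamond value is not given in the text; the closed form used below is the one commonly cited in the space syntax literature and is an assumption here, as are the function names and the input values.

```python
import math

def diamond_value(n):
    """Commonly cited diamond value for a graph with n nodes (assumed formula)."""
    return 2.0 * (n * (math.log2((n + 2) / 3.0) - 1.0) + 1.0) / ((n - 1) * (n - 2))

def integration(total_depth, n):
    """Integration of one line from its total depth, following Eqs. (3) and (4)."""
    mean_depth = total_depth / (n - 1)            # Eq. (3)
    ra = 2.0 * (mean_depth - 1.0) / (n - 2)       # relative asymmetry
    rra = ra / diamond_value(n)                   # real relative asymmetry
    return 1.0 / rra                              # undefined when mean_depth == 1

# Hypothetical line with total depth 18 in a graph of 10 axial lines:
# mean depth = 2, RA = 0.25, diamond value = 22/72, integration = 11/9.
value = integration(total_depth=18, n=10)
```

Normalizing by the diamond value makes integration comparable between graphs of different sizes, which matters when floors or buildings are analyzed separately.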

Four network centrality measures are selected to quantify the structural characteristics of the network. Betweenness measures how often a node lies on the shortest paths between pairs of other nodes. Nodes acting as bridges along shortest paths therefore tend to have high betweenness values. The betweenness B of node i can be calculated with the following formula, where Path(s, t) is the number of shortest paths between nodes s and t and Path(s, i, t) is the number of those paths that pass through node i:

$${B}_{i} = {\sum }_{s\ne t\ne i}Path\left(s,i,t\right)/Path\left(s,t\right)$$
(5)
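Eq. (5) can be evaluated directly on a small graph by counting shortest paths with breadth-first search. The sketch below is a straightforward, non-optimized illustration that sums over unordered pairs (s, t), which is an assumption about the counting convention; for large networks a faster algorithm such as Brandes' would normally be preferred. The example graph is hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs_counts(graph, source):
    """Return (dist, sigma): step distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma

def betweenness(graph):
    """Betweenness of every node, summing over unordered pairs (s, t)."""
    info = {s: bfs_counts(graph, s) for s in graph}
    scores = {i: 0.0 for i in graph}
    for s, t in combinations(graph, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are disconnected
        for i in graph:
            if i in (s, t) or i not in dist_s:
                continue
            dist_i, sigma_i = info[i]
            # i lies on a shortest s-t path iff the distances add up exactly.
            if t in dist_i and dist_s[i] + dist_i[t] == dist_s[t]:
                scores[i] += sigma_s[i] * sigma_i[t] / sigma_s[t]
    return scores

# Hypothetical chain of five axial lines: a - b - c - d - e
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
scores = betweenness(chain)
```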

Closeness is defined as the inverse of the total graph-theoretic distance of a given node from all the other nodes in the network, which measures how close a node is to all other nodes along the shortest paths. Given a node i in the network, its closeness can be expressed as:

$${C}_{i} = \frac{1}{\sum_{j\ne i}{d}_{ij}}$$
(6)

where \(d_{ij}\) represents the graph-theoretic distance between node i and node j.
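A minimal sketch of Eq. (6), assuming a connected graph and unit-length steps between intersecting lines; the example graph and function name are hypothetical.

```python
from collections import deque

def closeness(graph, node):
    """Inverse of the summed step distances from `node` to all reachable nodes."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return 1.0 / total if total > 0 else 0.0

# Hypothetical chain of five axial lines: a - b - c - d - e
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}

central = closeness(graph, "c")  # distances 1, 1, 2, 2 sum to 6
end = closeness(graph, "a")      # distances 1, 2, 3, 4 sum to 10
```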

As an alternative centrality measure, eigenvector centrality considers both the number of connections of a given node and the importance of the nodes it connects to, expressed as scores. Relative scores are assigned to all nodes in the network, following the principle that connections to high-scoring nodes contribute more than the same number of connections to low-scoring nodes. The eigenvector centrality E measures how well connected a node i is to other nodes in the network, which can be written as:

$${E}_{i} = \frac{1}{\lambda } \sum_{j=1}^{n}{A}_{ij}{E}_{j}$$
(7)

where \(E_{i}\) is the i-th component of the eigenvector associated with the largest eigenvalue \(\lambda\) of the \(n\times n\) adjacency matrix A.
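Eq. (7) is usually solved iteratively. The sketch below uses power iteration with a shift by the identity matrix (each score is updated to itself plus the sum of its neighbors' scores), which converges to the same leading eigenvector while avoiding oscillation on bipartite graphs. The iteration count, normalization, and example graph are assumptions made for illustration.

```python
import math

def eigenvector_centrality(graph, iterations=200):
    """Power iteration on (A + I); returns scores with unit Euclidean norm."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(scores[nb] for nb in graph[node])
            for node in graph
        }
        norm = math.sqrt(sum(v * v for v in updated.values()))
        scores = {node: v / norm for node, v in updated.items()}
    return scores

# Hypothetical star: hub "h" intersects three dead-end lines x, y, z.
graph = {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}
scores = eigenvector_centrality(graph)
```

For this star the leading eigenvalue of the adjacency matrix is the square root of 3, so the hub's score is that many times the score of each dead-end line.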

PageRank is a variant of eigenvector centrality that was originally introduced by Google to rank web pages. In graph theory, PageRank recursively computes a normalized and propagated value for each node in the network.
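The recursive computation can be sketched as a simple fixed-point iteration. The damping factor of 0.85 is the conventional default and is an assumption here (the text does not state the value used); the sketch also assumes an undirected graph in which every line has at least one connection.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Iterative PageRank on an undirected graph without isolated nodes."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # Each neighbour passes on an equal share of its current rank.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1.0 - damping) / n + damping * incoming
        rank = updated
    return rank

# Hypothetical star: hub "h" intersects three dead-end lines x, y, z.
graph = {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}
rank = pagerank(graph)
```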

3.3 Pedestrian flow estimation and correlation analysis

This section illustrates the estimation of the route-based pedestrian flow in the 3D axial network and the correlation analysis of pedestrian flow and network features.

3.3.1 Estimation of pedestrian flow

Unlike mobile phone signaling data, Wi-Fi AP data cannot be collected at regular time intervals, because smartphones nowadays automatically disconnect from APs for privacy protection and battery saving. Despite the wide distribution of Wi-Fi access points and the high percentage of smartphone users on campus, it is therefore necessary to infer the flow routes to improve trajectory completeness before calculating pedestrian flows (Jiang et al., 2023; Forghani et al., 2020; Bonnetain et al., 2021; Zhou et al., 2021). It should be noted that the pedestrian flow here is not the number of connections made directly to the sensing devices, as in Meneses and Moreira (2012) and Ding et al. (2019); the flow in this study is termed route-based pedestrian flow. Flow estimation has three main steps: 1) matching each connected AP to the closest axial line; 2) inferring the visited route between pairs of APs throughout a day for each user; 3) adding up the number of visits by all users over 7 days for each axial line as its pedestrian flow. The techniques used for AP matching and route inference are explained below with examples.
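Step 1 can be sketched as a nearest-segment search in 3D. The example assumes that each AP and each axial line endpoint has coordinates (x, y, z) in metres, with z derived from the floor index; the coordinates, line names, and helper functions are hypothetical and are not taken from the study's data.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the 3D segment a-b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    # Projection parameter clamped to [0, 1] so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(ap[i] * ab[i] for i in range(3)) / denom))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)

def match_ap(ap_xyz, axial_lines):
    """Return the name of the axial line closest to the access point."""
    return min(axial_lines, key=lambda name: point_segment_distance(ap_xyz, *axial_lines[name]))

# Hypothetical axial lines: two corridors 4 m apart vertically and an elevator line.
axial_lines = {
    "F1_corridor": ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0)),
    "F2_corridor": ((0.0, 0.0, 4.0), (10.0, 0.0, 4.0)),
    "elevator": ((10.0, 0.0, 0.0), (10.0, 0.0, 4.0)),
}
matched = match_ap((5.0, 1.0, 4.0), axial_lines)  # an AP on the second floor
```

Including the vertical coordinate keeps an AP from being matched to a corridor directly above or below it on another floor.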

As shown in Figure S2, a spatial network with five axial lines (L1-L5) has multiple potential routes between the two georeferenced APs. The figure applies to both 2D and 3D contexts, as the location information of an AP can include both the coordinates within a floor and the floor index (Table 1). After all APs are matched to the closest axial line, the chronological sequence of AP connections (Table 2) can be projected as movement trajectories along different axial lines (namely, across multilayered buildings). When a user moves in the study area with the Wi-Fi function of his or her smartphone enabled, the Wi-Fi connection information is automatically recorded (Table 2). For example, when the user connects to AP1 at time T1, his or her approximate location is represented by the location of AP1. To define the closest AP and avoid overlap within and between floors, we assign the AP that has the highest RSSI value, provided that this value exceeds the RSSI threshold. When the user moves from AP1 to AP2 at time T2, the shortest paths between the origin and destination can be calculated using the Dijkstra algorithm. In this case, there are three shortest paths: the dashed path in green, the dotted path in red and the dash-dot path in blue. All three are shortest in terms of geometric length. However, there are three "turns" in the dotted red path and the dash-dot blue path, whereas there is only one "turn" in the dashed green path. In this study, we choose the dashed green path as the shortest path between AP1 and AP2, because it is both geometrically and topologically shortest. It goes through two lines, L3 and L4, so the pedestrian flow of each of these two lines is increased by one. In this way, the route-based pedestrian flow of all 3D axial lines can be estimated.
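One way to express this selection rule is a Dijkstra search whose cost is the pair (geometric length, number of turns), compared lexicographically, so that turns only break ties between equally long routes. The sketch below is an assumption about how such a rule could be implemented, not the authors' code: it runs on a hypothetical network of junctions whose segments are labelled with the axial line they belong to, and a "turn" is counted whenever consecutive segments belong to different lines.

```python
import heapq
import itertools
from collections import Counter

def fewest_turn_shortest_path(edges, origin, destination):
    """Return (length, turns, lines) of the shortest route with the fewest turns.

    edges: iterable of (junction_u, junction_v, length, line_id), undirected.
    """
    adjacency = {}
    for u, v, length, line in edges:
        adjacency.setdefault(u, []).append((v, length, line))
        adjacency.setdefault(v, []).append((u, length, line))
    tie = itertools.count()  # keeps heap entries comparable when costs are equal
    heap = [(0.0, 0, next(tie), origin, None, ())]
    settled = set()
    while heap:
        length, turns, _, node, line, lines = heapq.heappop(heap)
        if (node, line) in settled:
            continue
        settled.add((node, line))
        if node == destination:
            return length, turns, list(lines)
        for nxt, seg_len, seg_line in adjacency.get(node, []):
            changed = line is not None and seg_line != line
            new_lines = lines if seg_line == line else lines + (seg_line,)
            heapq.heappush(
                heap,
                (length + seg_len, turns + int(changed), next(tie), nxt, seg_line, new_lines),
            )
    return None

# Hypothetical network: two routes of length 3 from A to F, with one and two turns.
edges = [
    ("A", "B", 1, "H0"), ("B", "C", 1, "H0"),
    ("D", "E", 1, "H1"), ("E", "F", 1, "H1"),
    ("B", "E", 1, "V1"), ("C", "F", 1, "V2"),
]
route = fewest_turn_shortest_path(edges, "A", "F")

# Each axial line on the chosen route receives one unit of flow.
flow = Counter()
flow.update(route[2])
```

Repeating the last step for every consecutive AP pair of every user and summing the counters gives a per-line flow of the kind described above.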

Table 1 Access point attributes
Table 2 Wi-Fi log data

Pedestrian route choice is influenced by several factors. One school of thought assumes that the shortest routes are preferred, whereas another holds that the topological factor (i.e., fewest turns) dominates. Previous studies have aimed to resolve this debate on pedestrian route choice behavior, for example Shatu et al. (2019), who conducted an empirical analysis using route choice data from 178 students and concluded that the least directional change is the preferred option in pedestrians' route choice decisions. That experimental study was conducted on urban streets (i.e., at the city level) rather than on a university campus. Although the surrounding environment influences pedestrians' route choice decisions, the participants in that experiment were students, which makes its findings relevant to a campus population. We therefore make a simple assumption and adopt the topological perspective for estimating route-based pedestrian flow, inspired by that experimental study (Shatu et al., 2019). Undoubtedly, the flow estimates can change if different perspectives/models of route choice are used. For example, Sevtsuk and Kalvo (2020) compared the pedestrian flow prediction efficiency of five route choice models, namely "shortest path", "equal probability", "distance-weighted probability", "utility-weighted probability", and "highest utility". However, any model or perspective applied in determining pedestrian flow has its limitations, given the inherent uncertainty of route choice models, as pedestrian route choice decisions vary from person to person and from time to time. The focus of this work is not to compare the influences of different route choice models but to investigate the relationship between human movements and 3D morphology. It should be noted, however, that the obtained relationship may not hold for all scenarios and is mainly meaningful for routes and flows estimated with the Dijkstra algorithm.

Notably, to protect user privacy, all personal information is removed. The MAC address of each smart device is used to represent the user ID, and it is also masked to protect user privacy. Moreover, no individual data are used in this work; the calculation and analysis are performed at the aggregate user level.

3.3.2 Multiple linear regression analysis

In classical space syntax studies, it is agreed that space syntax "presents a deterministic model of space use", and "social activity (and subsequently user spatial interaction) is then calculated as a function of spatial characteristics" (Cheliotis, 2020). The pedestrian flow in the 3D spatial network represents, to some extent, how people use 3D space, and thus the spatial morphology of the 3D space should be able to predict the pedestrian flow. Therefore, the coefficient of determination (\(R^{2}\)) between spatial structure and pedestrian flow is expected to reflect this relationship. Generally, the coefficient of determination is calculated using the pedestrian flow and a single network centrality parameter, such as local integration.

By nature, the spatial network is a kind of complex network, for which there are many other network parameters, such as betweenness centrality and PageRank. Existing studies show that a set of combined network parameters, including topological and geometrical measures, performs better than a single parameter in analyzing the relationship between a network and human or traffic mobility (Senousi et al., 2020; Zhao et al., 2017). To better explore the predictability of the multilayered spatial structure in 3D space, a multiple linear regression method is used to improve pedestrian flow prediction (Pun et al., 2019). The main focus of this work is to examine how the multilayered space syntax-based 3D spatial structure can predict the route-based pedestrian flow. A multiple regression model is preferred over a bivariate simple linear regression model because the dependent variable, such as human flow, is highly likely to be associated with more than one factor.

To conduct the multiple linear regression analysis, six space syntax parameters (Connectivity, Control, Mean Depth, Global Integration, Local Integration, and Local Depth), four other frequently used network centrality metrics (Eigenvector, Closeness, Betweenness, and PageRank), and the geometric length of the axial lines are selected to quantify the characteristics of the axial line-based 3D spatial network. For the detailed calculation of the traditional space syntax parameters and other parameters, please refer to the Characterizing spatial network structure section and related work (Jiang & Claramunt, 2002; Senousi et al., 2020; Zhao et al., 2017).
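To make the regression step concrete, the following sketch fits an ordinary least squares model with an intercept and reports the coefficient of determination. It is a self-contained illustration using only the Python standard library, not the SPSS procedure used in this study; the two predictor columns and all numeric values are synthetic and hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(features, target):
    """Return (coefficients with intercept first, R squared) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return beta, 1.0 - ss_res / ss_tot

# Synthetic data: flow = 2 + 3 * metric_1 - 1 * metric_2 (no noise).
features = [(1.0, 0.0), (2.0, 1.0), (3.0, 1.0), (4.0, 3.0), (5.0, 2.0)]
flow = [5.0, 7.0, 10.0, 11.0, 15.0]
beta, r_squared = fit_ols(features, flow)
```

With real data the predictors would be the network metrics of each axial line and the target its estimated flow; strongly correlated predictors would make the normal equations ill-conditioned, which is one reason to prefer an established statistics package in practice.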

4 Results and discussion

A university campus in Hong Kong is selected as the study area for three main reasons: 1) the extensive coverage of WLAN, 2) the high percentage of active Wi-Fi users, and 3) the highly densified vertical campus environment. A university campus is an interesting testbed for a study of 3D spatial structure and human behavior because a large group of people is interconnected via many shared activities, such as lecture attendance, lab testing, exam sitting, project work and sporting events, all situated within the campus environment, horizontally and vertically. This work tracks dynamic human flows (i.e., via route estimation) to evaluate how 3D spatial structure interacts with human movements.

4.1 Study area and data processing

As mentioned above, a university campus is selected as the study area and represented as a 3D map (Figure S3-(a)). The original map data is stored in AutoCAD format, and the total size is about 165 megabytes. The data of each building is stored in a single folder, and each folder contains the layouts (Figure S3-(b)) of all floors of the building. The entire map is converted into a unified format and projection system for 3D modeling. Using the methods described in the Modeling spatial morphology using space syntax section, the multilevel campus space is transformed into axial maps that are used to construct the multilayer graph model, in which axial lines are the nodes and connections between axial lines in the 3D context denote the graph edges.

As introduced in the Estimation of pedestrian flow section, we obtained the human flow of each axial line using Wi-Fi log data for seven consecutive days, from March 4 to 10, 2019, which was a normal week. There are around 35,000 users in total, including all students and staff. The route-based pedestrian flow of all axial lines is calculated day by day and then accumulated as the total flow. As shown in Fig. 2, the total pedestrian flow is markedly heterogeneous across the campus space, with dense flow concentrated on the university podium, library, main streets, and classrooms near the floor transition infrastructure (e.g., elevators). This 3D map demonstrates that the Wi-Fi log data can capture human flow at very high resolution, which enables the space syntax morphology analysis to cover the whole studied space. Traditional flow observation methods, such as traffic surveys or manual counting, can normally be applied only at a limited number of selected locations. Taking a closer look at the podium level (Figure S7-(c)), it is clear that the axial lines closest to each entrance/exit have the highest pedestrian flow. However, the PolyU campus has multiple entrances/exits, and the visual patterns of the flow-betweenness relationship differ obviously among them. This is mainly because the morphology metrics in the 3D context cannot be easily interpreted from the visual patterns of a single 2D layer (i.e., the podium level): the 3D multilayered environment could have an integrated effect on human behavior. Therefore, we further quantify the characteristics of human flow, spatial morphology, and their relationship in the sections below.

Fig. 2

Estimated route-based pedestrian flow of axial lines in 3D space. Red indicates high flow volume and blue indicates low flow volume

4.2 Spatial network analysis

4.2.1 Scaling property analysis

As introduced in the Characterizing spatial network structure section, the topological network centrality measures (e.g., betweenness, closeness, eigenvector, PageRank) and the traditional space syntax measures (e.g., degree, control, global integration, local integration, total depth, local depth, and mean depth) of the 3D spatial network are calculated. A campus with densely connected space in three dimensions is very likely a complex system. We further explore the characteristics of this system in this section, particularly the scaling properties of its morphological features. The cumulative probability distributions are estimated for two space syntax metrics (i.e., degree and local integration) and two centrality measures (i.e., betweenness and eigenvector).
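
As a hedged illustration of how some of these measures follow from the graph alone, the sketch below computes degree, control, total depth, and mean depth with a breadth-first search. It assumes an unweighted adjacency dict and the usual definition of control as the sum of the reciprocal degrees of a node's neighbours; integration, which additionally requires a normalization step, is deliberately left out. The toy graph and the function names are hypothetical.

```python
from collections import deque

def depths_from(adj, root):
    """Topological depth (number of steps) from root to every reachable node."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth

def syntax_measures(adj):
    """Degree, control, total depth and mean depth for each node."""
    measures = {}
    for node in adj:
        depth = depths_from(adj, node)
        total_depth = sum(depth.values())
        reachable = len(depth) - 1  # exclude the node itself
        measures[node] = {
            "degree": len(adj[node]),
            # Each neighbour shares 1/degree of its "control" with this node.
            "control": sum(1 / len(adj[nb]) for nb in adj[node]),
            "total_depth": total_depth,
            "mean_depth": total_depth / reachable if reachable else 0.0,
        }
    return measures

# Hypothetical star-shaped graph: one hub line crossed by three short lines.
adj = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
measures = syntax_measures(adj)
```

In this example the hub has a mean depth of 1 (every other line is one step away), whereas each peripheral line has a mean depth of 5/3, which matches the intuition that shallow lines are more integrated.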

As shown in Fig. 3-(a) and (b), the probability distributions of degree and local integration follow an exponential distribution: \(\Pr(D) = 1.782\,e^{-0.5D}\) and \(\Pr(\mathit{LItg}) = 1.346\,e^{-0.77\,\mathit{LItg}}\), respectively. As the insets of Fig. 3-(a) and (b) show, the distributions are approximately linear on semi-log axes. The distributions indicate the scaling property of a complex network. As shown in Fig. 3-(c) and (d), the probability distributions of betweenness and eigenvector are consistent with a power-law distribution with an exponential cutoff: \(\Pr(B) = 0.049\,B^{-0.34}\,e^{-36.34B}\) and \(\Pr(E) = 0.008\,E^{-0.36}\,e^{-6.1E}\), respectively. The heavy-tailed distributions indicate the scaling property of a complex network and reflect an imbalance: a high percentage (about 80%) of the lines in the 3D spatial network have centrality measures below the average value, while a small percentage (about 20%) have centrality measures above the average. Although the scale-free property has been detected extensively in various real-world complex networks, the exponential distribution has also been found in many real-world networks, which is attributed to different mechanisms of complex networks (Deng et al., 2011).

Fig. 3

Cumulative distribution of four measures of the space syntax and network centrality: (a) degree, (b) local integration, (c) betweenness, and (d) eigenvector
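
The semi-log linearity noted above suggests a simple way to estimate an exponential fit of the form \(\Pr(x) = a\,e^{-bx}\): regress the logarithm of the cumulative probability on the metric value. The sketch below is one possible procedure under that assumption, not necessarily the fitting method used for Fig. 3. The function names are hypothetical, and the check at the end uses synthetic points generated from the reported degree fit rather than the campus data.

```python
import math

def empirical_ccdf(values):
    """Sorted unique values x and the empirical probability Pr(X >= x)."""
    n = len(values)
    xs = sorted(set(values))
    return xs, [sum(1 for v in values if v >= x) / n for x in xs]

def fit_exponential(xs, probs):
    """Least-squares fit of log(Pr) = log(a) - b * x; returns (a, b)."""
    ys = [math.log(p) for p in probs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic check: points lying exactly on the reported degree curve.
xs = list(range(1, 11))
probs = [1.782 * math.exp(-0.5 * x) for x in xs]
a, b = fit_exponential(xs, probs)

# Empirical CCDF of a small hypothetical degree sample.
sample_x, sample_p = empirical_ccdf([1, 1, 2, 3])
```

With real data, `empirical_ccdf` would be applied to the degree (or local integration) of all axial lines and the result passed to `fit_exponential`. A power law with an exponential cutoff needs a different model and is not covered by this sketch.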

It can be observed that the two space syntax metrics and the two centrality measures display different distributions. It is therefore reasonable to assume that a distinct relationship exists between the two types of metrics. For instance, some lines with high degree values, such as elevators, may have low betweenness because of the 3D environment. This may reflect a difference between 2D and 3D spatial morphology. Although space syntax has been used to explore campus morphology (Lo et al., 2015; Li & Dewancker, 2019), most studies have not examined the scaling properties of a 3D graph. Our results help fill this evidence gap and indicate that 3D campus space is a typical complex system in terms of its outdoor-indoor morphology.

4.2.2 Spatial morphology and community detection in 3D space

The heavy-tailed distribution of morphology metrics indicates that a few locations serve as functional areas connecting others, while many other locations are relatively 'deep' and 'separate'. In this part, we visualize the metrics on axial maps to examine the spatial patterns related to the above statistical distributions. In Fig. 4-(a), a key space syntax metric, local integration, is rendered with red representing the most integrated lines and blue the most segregated ones. The colored 3D axial map shows that, in the horizontal direction, the highest integration scores are found along the main street of the podium floor, which connects the university canteen, library, and both administrative and academic buildings. In the vertical direction, it is reasonable to see that the elevators of each building, rather than the staircases, are the most integrated locations. More interestingly, looking at elevators only, some elevators are more integrated than others because their intersections with the horizontal floors differ. Put simply, elevators connected to the podium main street have even higher integration scores. This justifies the necessity of constructing a 3D axial line model in a densely built environment; otherwise, a space syntax analysis of a single floor may underestimate the structural features. The space syntax findings here for a university campus agree with the wider theoretical discussion and empirical analysis of multilayer networks (Kivelä et al., 2014; Zhang et al., 2021), which address the drawback of ignoring multi-dimensional connections.

Fig. 4

Axial maps visualized according to (a) local integration and (b) community detection in the 3D space syntax network (the seven detected communities are shown in different colors)

The importance of considering the horizontal and the vertical as a whole can be further demonstrated using community detection from network science. This technique is used to detect self-contained local structures (clustered entities) in a graph: topological connections inside a cluster are significantly stronger than the connections reaching out to other clusters. A high degree of clustering is also a key feature of complex networks (Clauset et al., 2009), besides scaling properties. Employing the modularity optimization algorithm proposed by Blondel et al. (2008) and also employed by Law et al. (2013), the 3D complex network is divided into seven communities (Fig. 4-(b)).
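
To clarify what modularity optimization rewards, the sketch below computes the modularity score of a given partition of an unweighted graph, using the standard form \(Q = \sum_c \left[ L_c/m - \left(d_c/2m\right)^2 \right]\), where \(L_c\) is the number of edges inside community \(c\), \(d_c\) is the sum of degrees in \(c\), and \(m\) is the total number of edges. It only evaluates a partition and does not implement the search procedure of Blondel et al. (2008); the toy graph and labels are hypothetical.

```python
def modularity(adj, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    intra_edges = {}
    degree_sum = {}
    for u, nbrs in adj.items():
        c = community_of[u]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        for v in nbrs:
            if community_of[v] == c:
                # Each internal edge is visited from both of its ends.
                intra_edges[c] = intra_edges.get(c, 0) + 0.5
    return sum(intra_edges.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical graph: two triangles of axial lines joined by one bridge (c-d).
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in adj}
q_split = modularity(adj, split)
q_single = modularity(adj, single)
```

Splitting the graph at the bridge gives \(Q = 5/14 \approx 0.36\), whereas placing all lines in one community gives \(Q = 0\), so a modularity-based method would prefer the two-community partition.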

In our case, communities are groups of axial lines that are strongly interconnected. As shown in Fig. 4-(b), communities visualized in different colors are spatially clustered in several main university areas: library and clinics in red, administrative building in blue, canteen and sports facilities in green, etc. Previous studies used human perception to determine similar areas on campus (Li et al., 2019; Sun et al., 2015), which requires specialized knowledge and extensive time. In contrast, applying community detection to the space syntax graph provides an efficient way to group axial lines with similar structures and helps identify functional areas from a bottom-up perspective.

4.3 Analyzing the relationship between 3D campus morphology and route-based pedestrian flow

Table 3 summarizes the univariate regression between the 3D network variables and the route-based pedestrian flow. One of the key goals in space syntax analysis is to predict human movement using spatial configuration metrics (Penn et al., 1998). The underlying assumption is that how spatial entities are arranged/configured will have an impact on human behavior. However, as shown in the sections above, the complexity of spatial configuration can increase dramatically in a highly connected environment (e.g., 2D-3D buildings), in which network science metrics might be more representative. Therefore, the univariate regression here tests the space syntax metrics and the network science metrics one by one to examine which metrics better predict human flow.

Table 3 Simple linear regression results of route-based pedestrian flow

The R values are the correlation coefficients between pedestrian flow and the different metrics, and the R² values give the goodness of fit of the univariate models. Local integration has a correlation coefficient of 0.50, whereas network metrics such as eigenvector centrality and betweenness centrality have higher correlation coefficients (0.55 and 0.67, respectively) with route-based pedestrian flow. To some extent, this regression result is consistent with the patterns in Figs. 2 and 3. Human flows on campus (and at many different spatial scales) present heavy-tailed patterns. The network metrics in this case also present scaling properties, with more strongly heavy-tailed distributions than the traditional space syntax metrics. This is not to say that one metric is definitely better than the others; rather, different metrics capture different aspects of spatial configuration. For human mobility analysis, the results hint that the morphological features represented by network metrics such as eigenvector and betweenness capture the aspects of spatial configuration that have a greater impact on human flows, whereas in the analysis of other human behaviors the traditional space syntax metrics could be more representative.
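
For a univariate linear model, the goodness of fit is simply the square of the correlation coefficient. The short sketch below shows the Pearson calculation on hypothetical toy data and then squares the three correlation coefficients quoted above; the function name and the toy vectors are illustrative assumptions, not the campus data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy data: a perfectly linear relation gives r = 1 (up to rounding error).
toy_metric = [1, 2, 3, 4]
toy_flow = [3, 5, 7, 9]
r_toy = pearson_r(toy_metric, toy_flow)

# Correlation coefficients quoted in the text, squared to obtain R² per metric.
reported_r = {"local integration": 0.50, "eigenvector": 0.55, "betweenness": 0.67}
r_squared = {name: round(r * r, 2) for name, r in reported_r.items()}
```

Squaring 0.67 and 0.50 gives approximately 0.45 and 0.25, so betweenness alone accounts for less than half of the variance in flow, which motivates combining several metrics in a multiple regression.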

Previous studies (Pun et al., 2019) found that including both network topological measures and geometric length can better estimate traffic flow using a multiple regression approach. In this work, six space syntax measures (connectivity, control, global integration, local integration, mean depth, and local depth), four topological measures of network centrality (eigenvector, betweenness, closeness, and PageRank), and the geometric line length are selected as independent variables in a multiple linear regression analysis. First, multicollinearity of the independent variables is examined by computing the variance inflation factor (VIF) of each variable in a full model. The variables with a VIF of less than ten are selected: local depth, eigenvector centrality, betweenness centrality, and line length. Two models are then constructed for comparison: Model 1 includes only traditional space syntax metrics, and Model 2 includes network science metrics as well.
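
The VIF screening step can be sketched as follows. The VIF of predictor \(j\) is \(1/(1 - R_j^2)\), where \(R_j^2\) comes from regressing predictor \(j\) on all the others, and it equals the \(j\)-th diagonal entry of the inverse of the predictors' correlation matrix. The sketch uses that identity with a small pure-Python matrix inversion; the predictor names and values are hypothetical and only the threshold of ten is taken from the text.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical predictors: x3 is almost a copy of x1, x2 is moderately correlated.
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 1, 4, 3, 5],
    "x3": [1.1, 1.9, 3.0, 4.1, 4.9],
}
vifs = dict(zip(predictors, vif(list(predictors.values()))))
kept = [name for name, value in vifs.items() if value < 10]

# Two-predictor check: with correlation 0.8 the VIF is 1 / (1 - 0.8**2).
pair_vif = vif([predictors["x1"], predictors["x2"]])
```

In the toy data the near-duplicate pair `x1` and `x3` both exceed the threshold and only `x2` is kept, which mirrors how strongly overlapping morphology metrics would be screened out of a full model.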

The results of the multiple linear regression analysis are summarized in Table 4. Overall, Model 2 is better than Model 1 in predicting human flows, with a correlation coefficient R of 0.756 and an R² of 0.571, which suggests that our 3D spatial network and pedestrian movement model can explain up to 57% of the variance in observed pedestrian flow in a complex built environment. The goodness of fit is substantially higher than that of the simple linear regressions on single metrics, for example betweenness centrality and local integration (R² of 0.45 and 0.25, respectively), as reported in Table 3. Figure S6 shows that the integrated prediction model has more residuals close to 0, and Figure S5 presents a good linear relationship between the observed and estimated pedestrian flow. These results suggest that combining multiple spatial measures (classical space syntax measures and complex network measures) estimates pedestrian flow better than relying on a single type of metric.

Table 4 Multiple linear regression of route-based pedestrian flow

In summary, the significance of the space syntax local mean depth measure (0.25) and of line length confirms previous space syntax research in which spatial configuration is a strong proxy for route-based pedestrian flow. This study goes further by examining the role of network science metrics in predicting route-based pedestrian flow in a 3D environment. Eigenvector centrality captures locally the nodes that are connected to the most important nodes, while topological betweenness centrality captures the paths between clusters, such as bridges between buildings. These network metrics can supplement space syntax metrics to enhance the representativeness of spatial configuration models, which may then perform better in depicting human behavior.

5 Conclusion

Spatial morphology in 3D space plays an important role in understanding its relationship with human mobility in a highly densified "vertical" built environment like Hong Kong. Most existing studies focus on 2D space because of limited data access and methods, while the increasingly available human sensing big data makes it feasible to explore 3D spatial morphology. This study constructs multilayered axial maps to represent the multilevel 3D campus space, on which a set of space syntax and network science metrics are calculated and characterized. Wi-Fi log data is used to infer individuals' trajectories along the axial map routes, which are aggregated to estimate pedestrian flows across space, and the relationship of these flows to the complex 3D morphology is examined. The study contributes to the literature in several ways:

  • 1) Developing an effective way of estimating route-based pedestrian flows from large-scale human sensing data on campus: In space syntax studies, observing empirical human flows is an important step in validating the relationship between spatial morphology and human behavior. Although human sensing data has been used to estimate human flows in intra-city, inter-city, and continental research (Barbosa et al., 2018), it is less often reported in building-scale space syntax analysis. Compared to traffic surveys and manual counting, Wi-Fi log data has better spatial coverage and can capture the spatial distribution of human activity at high resolution (Fig. 2). Considering the routes between consecutive locations is important when a study is conducted in a densely built environment and the movement sensing data is not collected at regular and frequent intervals. The proposed route-based flow estimation is useful for building-scale analysis because other popular sensing data, such as mobile phone and transport card data, is either unavailable or performs poorly in indoor environments. The method used here provides a way to convert discontinuous log data into continuous flow trajectories. It is applicable to built-up areas where Wi-Fi access points are widely distributed, such as campuses, office buildings, shopping malls, and cities (Chen et al., 2022; Yan et al., 2022). This work relies only on routes estimated by the Dijkstra algorithm, but the framework opens up opportunities for future research to further tune route choices in indoor environments and to investigate pedestrian flows from different angles.

  • 2) Characterizing the complex nature of the 3D morphology of a university campus: Scaling properties are widely reported in physical, environmental, and social systems (Albert & Barabási, 2000; Alessandretti et al., 2020). However, very little research has reported the extent to which the spatial morphology of a densely built space presents the characteristics of a complex system. Using a university campus as the testbed, space syntax metrics such as local integration are found to be exponentially distributed, and network science metrics such as eigenvector and betweenness are found to follow power-law distributions with an exponential cutoff. One reason is that connections in the vertical direction increase the complexity of the existing horizontal connections, as we observed in the high integration scores of elevators. A similarly strong effect of vertical transitions is also reported by Zhang et al. (2012) using different approaches. This suggests that the growing need for 3D models arises not only from their 'realistic' visualization of space but, more importantly, from their ability to represent the core idea of space syntax (i.e., spatial relations) while accounting for the complex structure of the 3D environment. The suggestion to adopt multilayer networks and metrics is not limited to the campus context but extends to future studies that involve complex morphology across dimensions.

  • 3) Proving that network science metrics can enhance the morphology model in predicting human flows: Space syntax analysis originally borrowed methods from graph theory, representing space and spatial relations as a graph. Recent network science metrics have proven useful in capturing structural features in many complex systems, but they are less examined in 3D morphology–human flow studies. This study examines the role of network science metrics by comparing them to conventional space syntax metrics in two ways. The univariate regression presents the differences in correlation with human flow metric by metric, which is useful for other 3D human flow studies that consider the effect of a specific morphological metric. The multiple regression examines the overall influence of network metrics by comparing the goodness of fit of the space-syntax-only model with that of the space-syntax-network model. We found that the latter performs better in predicting flows.

Nevertheless, our case study has some limitations. First, the radius used to calculate space syntax metrics is set to 2. Because the focus of this paper is to combine network science metrics with the space syntax framework, analyzing varying radii would make the whole framework more complicated; a future study dedicated to the radius effect in the 3D campus context can be conducted. Second, different 3D spatial representations (Zhang & Chiaradia, 2019) and other environmental factors are not considered in our study; a comprehensive set of comparisons can be conducted in the future. Third, the route-based pedestrian flows are estimated by the Dijkstra algorithm, so any use of the obtained relationship between flows and morphology should take the context of this specific route estimation technique into account. Even so, the framework of this study is not only applicable to morphology studies at similar building scales but also hints that, if the complexity and connections of the studied space increase dramatically (the move from 2D to 3D being one example), advanced multilayer network metrics should be examined.