Introduction

The built environment (BE) exerts considerable influence on accessibility, efficiency, productivity, well-being, and social structures. In terms of transportation, the BE shapes people’s travel choices, destinations, travel times, etc. The McKinsey Global Institute (2010) [1] projected that 80% of India’s BE needs have to be operational by 2030, indicating a strong push to create infrastructure for the demand of 1.3 billion people. Currently, an average of 30% of total urban trips [2] are made on foot, a share greater than that of many cities in developed nations [3]. However, most Indian cities do not have the necessary sidewalk/crosswalk infrastructure, and if such infrastructure is not planned in an integrated manner, Indian cities may face the same consequence as developed nations [3], where the walking share is low and the little infrastructure that is in place is under-utilized.

Improving the pedestrian BE has become all the more relevant owing to the COVID-19 pandemic, which disrupted daily activities and highlighted the need for safe and sustainable transportation in cities. Experience with the pandemic showed that physical distancing is an inseparable aspect of combating the virus; however, locking down the entire economy is only a temporary solution. The larger balance lies in enforcing physical distancing strategically while allowing people to move through the city and access essential services. Cities such as Bogota, Berlin and Milan have responded to this crisis by transforming their streets for walking (and cycling) to assure safe mobility. By placing health and economic recovery at the core of interventions, such cities have reallocated streets to pedestrians while allowing local businesses to reopen safely [4]. Improving the walking BE also has economic benefits. Reallocating streets to pedestrians during the pandemic has improved the performance of local businesses in New York, USA [5], and a Transport for London report indicates that improving streets for pedestrians can increase the retail sales of commercial establishments by 30% [6]. Therefore, walking as a mode of transport will play an important role in resuming urban life as cities emerge from the harsh effects of the pandemic.

Indian cities have historically grown in an organic and compact manner [7], which fosters short travel distances between destinations. Proximity to destinations is thus a plausible explanation for the high pedestrian demand here. Despite this, the walking share is steadily declining [8] and priority is being given to the motorized travel experience through cost-intensive flyovers, bridges, etc. [8, 9]. India’s inherent strength toward a walkable and multimodal urban transport environment is being eroded by the neglect of non-motorized transport. To arrest this detrimental trend, policy-makers have introduced complete street development initiatives in India through pedestrianization and revitalization of neighborhoods [4]. A few of these initiatives can already be seen in the redevelopment plan for Chandni Chowk (New Delhi, India) [10], the Church Street pedestrianization project (Bengaluru, India) [11], and the Aundh pedestrian trial (Pune, India) [12]. However, none of these projects has been implemented at the network level; in each case the design covered only a stretch of the street network. Interconnectedness is an important consideration, especially for non-motorized modes [13]. Walking needs to be further encouraged not only by providing infrastructure, but also by integrating and connecting it coherently so that the whole infrastructure can function as a system.

With this notion in mind, this research aimed to carry out a connectivity analysis of three pedestrian networks in Varanasi, one of the oldest cities in India. The connectivity analysis uses both Geographic Information System (GIS)-based tools and Space-Syntax (SS) measures to arrive, through statistical analysis, at a potential connectivity indicator that helps understand pedestrian movement in the city. SS is a branch of computational geography that deals with the spatial configuration of links and their interconnection between nodes in any physical network. The regression models examined in this study helped identify the most influential SS index, i.e., the one best representing the connectivity of walking networks in the city. Additionally, path models were developed to investigate supplementary factors that influence pedestrian movement in the city and their relationship to network connectivity. Practical results derived from this study would assist the city authorities of Varanasi in identifying important and inter-connected street links that may require design modification and walking environment improvement. Furthermore, the procedure for network connectivity analysis devised in this study aims to provide a scientific method for identifying streets that may require retrofitting in any Tier-2 city in India with an organic street network morphology.

Relationships between Pedestrian Movement, Built Environment and Space Syntax

A walking trip, be it utilitarian or recreational, has multiple benefits. For example, communities with higher walkability have higher levels of physical activity and are therefore healthier [14, 15]. Transit ridership increases with improved walk access to transit stops [13, 16, 17]. On the other hand, Filion et al. [18] assessed pedestrian movement in Toronto, Canada, and pointed out that the impediments to walking include inhospitable BE conditions and the automobile dependency of residents.

Pedestrian travel in an urban area is likely influenced by: (1) the level of service of the existing built infrastructure; (2) regional characteristics (such as climate, topography, etc.); and (3) individual attributes (such as trip characteristics, preference, outlook, lifestyle, etc.) [19,20,21]. Of these, the BE has been the most addressed topic in the existing literature [22,23,24,25]. This is because the built walking environment can be modified or upgraded far more easily than regional characteristics or people’s individual attributes, which can rarely be altered in the short to medium term.

Currently, an average of 30–50% of total urban trips in Indian cities are made on foot; however, this share has been declining at a decadal average of 5% in recent years [8]. Although many existing resources describe walking conditions in Indian cities and propose strategies for improvement, such as complete street design, green mobility initiatives, pedestrian-oriented development, etc. [9, 26,27,28], very few have examined the connectivity of a pedestrian network and reported critical findings from it. An important aspect of the BE is the configuration/topology of pedestrian networks. This has been a much-discussed topic among urban planners and researchers in recent years, yet it remains less addressed than other BE features. Even if individual links provide a conducive walking environment, users will refrain from using them if they are poorly inter-connected within the network [29, 30]. Therefore, establishing the connectivity of the pedestrian network is an important part of planning the built infrastructure.

The research community has long linked pedestrian movement with the BE. Sharmin and Kamruzzaman [31] point out that there are two distinct schools of thought when it comes to measuring connectivity. One group tries to explain pedestrian movement through the geographical dimension of the BE (i.e., metric distance, landuse, infrastructure, etc.), while the other uses topological aspects through Space-Syntax (SS) measures. Helbich et al. [32] studied both kinds of BE measures and suggested that SS measures are as important as, or more important than, geographic approaches in explaining the relationship between the BE and pedestrian movement. Since its inception by Bill Hillier and his colleagues in the 1970s [33], the technique has been used in a number of applications, the most notable of which is the investigation of human behavior in discrete space. In particular, Bafna [34] described the technique as an investigation into the relationship between humans and spaces from the general perspective of inhabited space structures, such as buildings, settlements, cities or even landscapes. SS builds upon the conversion of a continuous space into discrete quantities, which is why it can be applied to architectural layouts as well as to a number of other applications. As such, the SS method emerges from graph theory, the branch of discrete mathematics used here to calculate configurational spatial relationships between streets in the built environment [35]. Through this investigation into human movement behavior and space, Hillier and his colleagues arrived at an influential theory of the built environment known as the ‘theory of natural movement’ [36]. Using the SS technique, it showed that the spatial configuration of the street network influences the flow of human movement and the location of shops in the built environment.

Since this theory came to light, researchers and practitioners have been addressing the challenge of applying these tools to planning practice and of using them to understand the link between the BE and pedestrian movement. This understanding is quantified by investigating the statistical relationship between indicators of pedestrian movement and one or more SS indices. These indices are computed by mathematical formulae describing the relative position of links and nodes (or, rather, their mathematical dual) in a network, and are calculated for each link in a network graph. Some of the most used indices are ‘integration’, ‘choice’, ‘connectivity’, etc. (elaborated later in “Data Sources”). Researchers [31, 36,37,38,39] found that ‘integration’ is closely and positively related to pedestrian volume, while Baran et al. [40] found a positive association between total utilitarian walking and the SS index of ‘control’. Leo and Seo [41] showed that the SS index ‘connectivity’ had a weak correlation with pedestrian volume, whereas Peponis et al. [42] reported a strong correlation in their study of Atlanta, USA. Owing to these inconsistencies, Sharmin and Kamruzzaman [31] conducted a meta-analytic review of SS indices from existing studies to synthesize a generalized understanding of the magnitude and direction of their effects. The review concluded that SS indices play a crucial role in explaining pedestrian movement in an inter-connected BE; however, the index has to be selected with caution.

This research confines the BE assessment to SS measurement of the pedestrian network. Existing work on SS assessment and pedestrian movement has made it clear that walking network connectivity has a strong bearing on pedestrian volume [31, 36, 37, 39]. However, other researchers attribute the intent to walk (and consequently pedestrian movement) to the availability of walking infrastructure, such as the presence of sidewalks [23, 24, 43,44,45]. This seems counter-intuitive for cities in the Indian context, where the share of walking is higher than in most cities of the global north despite largely absent pedestrian infrastructure. A major part of this can be attributed to the inherently compact nature of Indian cities; however, evidence of the association between pedestrian movement and the BE in such walking networks does not seem to exist in the SS literature. Indian cities are therefore well-suited candidates for such an examination. Accordingly, this research is geared toward understanding pedestrian movement in a walking network where sidewalk infrastructure exists only intermittently. The research attempts to answer the question, ‘Does network connectivity influence pedestrian movement in cities with little or no walking infrastructure?’

Research Framework

This section is divided into: (a) a brief introduction to the case study area; (b) the data sources used; and (c) the types of analysis conducted. A pictorial representation of the framework is presented in Fig. 1. The study networks and landuse maps are prepared in a GIS environment. These maps are validated on the ground, alongside the collection of pedestrian volume data. The SS indices are computed in a GIS environment from the geometry and spatial configuration of the walking networks. The analysis was conducted in two stages: first, correlation analysis was used to select only the most important SS indices; second, a regression model was developed to ascertain the relationship between the predictor and outcome variables. The regression helped identify predictor variables that displayed inconsistent effects on the outcome, suggesting that they may not have a straightforward/direct relationship with it; this was then examined using path models. The following subsections explain the research framework in detail.

Fig. 1 Research framework used in this study

Study Area

Varanasi is an old Indian city situated on the crescent-shaped left bank of the holy River Ganga in the state of Uttar Pradesh (UP) (Fig. 2). Currently, it is a Tier-2 city, with a recorded population of 1.4 million in 2011 [46]. The reason for focusing on a Tier-2 Indian city is that Tier-1 cities are generally larger metropolitan cities (a state capital, etc.) where, despite a degraded and unmaintained built walking environment, infrastructure investment in walking facilities is often evident. Attention is therefore shifting toward Tier-2 cities, which have a larger scope for improvement than bigger cities, as confirmed by many researchers [47, 48]. Varanasi spreads over an area of 84.55 sq. km [49], comprising 90 municipal wards with an average population density of 42,640 inhabitants per sq. km. It is also an important tourist attraction owing to its rich historical and cultural lineage [46].

Movement within the city is a mix of two-wheelers (34%), auto-rickshaws (20%), bicycles (16%), walking (14%), four-wheelers (6%), cycle-rickshaws (6%), and others (4%) [50]. Walking thus has only the fourth-highest share among modes, well below the national average of 30%. As documented by R. P. B. Singh & Pal (2012), present-day Varanasi grew organically around its old urban core. As time progressed, the increasing number of visitors put stress on the environment and on basic infrastructural amenities [51]. This gave rise to a number of urban problems, the most severe of which is traffic congestion, caused largely by the movement of pedestrians on the vehicular Right-of-Way (ROW) [51]. In the absence of sidewalk infrastructure, pedestrians in Varanasi are the most vulnerable road users. Although many existing studies address Varanasi’s urban problems, very few studies and reports deal with the transportation aspect [52, 53]. Owing to this lack of secondary information sources, the authors had to carry out extensive field-work to support the study.

The current study focuses on the newer urban areas of Varanasi because the older areas have a very distinct BE that is characteristic of historical/heritage cities in India [54]; including such areas would result in a study that is difficult to replicate in other cities. To study the pedestrian network in the city, three representative locations, typical of any Tier-2 city in India, were identified based on their landuse character, presence of pedestrian activity, and availability of walking infrastructure. These locations (Fig. 2) are not in close proximity to the old urban core.

Fig. 2 Varanasi Municipal Corporation area and study area locations

Data Sources

The study utilized multiple data sources to develop a GIS database and compute SS indices. The primary data were collected during site visits to Varanasi, carried out in two stages. The first stage was a reconnaissance survey in April 2019, conducted to assess the sites preliminarily. This was followed by a detailed second survey in May–June 2019 to examine the three study locations in depth. The following sections explain the data acquisition process in detail.

GIS database creation. GIS is used as a database tool for managing the road network, plot-level building footprints, and landuse information. The road center line (RCL) network data were procured from OpenStreetMap (OSM) (https://www.openstreetmap.org/). The network data were imported into the ArcGIS 10.3 environment and corrected for invalid geometry, duplicates, and overlaps so that they are topologically accurate for GIS operations. Additionally, panchromatic satellite images (2 m resolution, procured from the National Remote Sensing Centre, Hyderabad) and Google Earth were used to digitize RCLs missing from the OSM data. The ROW was also measured using Google Earth’s ‘measure distance’ tool. A hand-held Global Positioning System (GPS) unit was used to record streets within the study area that could not be identified from satellite imagery because of their small ROW (< 1.5 m).

The plot-level building footprints were digitized from the satellite imagery. Landuse information was compiled from Google Maps and Bing Maps and supplemented by field visits, during which landuse surveys were conducted with the help of geo-tagged photographs. These photographs were useful for validating the generated landuse map and for the pedestrian volume modeling analysis.

Online Mapping Platforms (OMPs) have recently started using crowd-sourcing to identify local attractions. This information was used to classify commercial establishments and public or semi-public institutions. Field visits were conducted to validate the information from OMPs and to ascertain the residential areas in the study locations. The landuse classes used in this study are:

  • Residential—gated community to low rise building plots;

  • Commercial—retail shops (grocery, garments, etc.), eateries, offices, hotels, etc.;

  • Institutional—hospitals, schools, colleges, monuments, places of worship, etc.;

  • Open spaces—parks, green fields, tree cover, etc.;

  • Water bodies—rivers, lakes, etc.

To carry out SS analysis, the RCL data were simplified using the Douglas–Peucker algorithm. This is because Kolovou et al. [55] pointed out that SS analysis based on the simplified version of the RCL data is more accurate than the one based on raw data.
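The simplification step can be illustrated with a minimal Python sketch of the Douglas–Peucker algorithm. The function names, the sample polyline, and the tolerance value are hypothetical and chosen only for illustration; the tolerance applied to the RCL data is not reported here, and distances are measured to the infinite line through the chord end points, which is a common simplification.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate chord: fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping vertices farther than `tolerance` from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord joining the end points.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the end points.
        return [points[0], points[-1]]
    # Otherwise split at the farthest vertex and simplify both halves.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical road centre line with small digitization wobble (arbitrary units).
rcl = [(0.0, 0.0), (1.0, 0.05), (2.0, -0.04), (3.0, 0.0),
       (4.0, 2.0), (5.0, 2.02), (6.0, 2.0)]
simplified = douglas_peucker(rcl, tolerance=0.1)
```

With the assumed tolerance of 0.1, the wobble vertices are dropped and `simplified` retains only the four vertices at which the line genuinely changes direction.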

The landuse map for each of the three study areas provided the extent of coverage of each landuse category (in square meters), and each link was coded according to the landuse with the largest presence along it. For example, a link is coded as ‘commercial’ if the area of the commercial landuse category adjacent to it is the highest among the categories listed above. This coding is important for identifying the major land uses adjacent to links while carrying out the pedestrian volume count surveys.
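The majority-landuse coding rule can be sketched in a few lines of Python. The function name and the example areas are hypothetical; how the landuse area along each link was measured in the GIS is not specified here, so the sketch simply assumes that the areas per category are already available.

```python
def code_link(landuse_area_sqm):
    """Return the landuse category with the largest area (sq. m) along a link.

    `landuse_area_sqm` maps a category name to its area; on a tie, the
    category listed first in the mapping is returned (an assumption).
    """
    if not landuse_area_sqm:
        raise ValueError("no landuse areas recorded for this link")
    return max(landuse_area_sqm, key=landuse_area_sqm.get)


# Hypothetical areas for one link, for illustration only.
example_link = {"residential": 420.0, "commercial": 610.5, "institutional": 80.0}
link_code = code_link(example_link)  # 'commercial'
```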

Pedestrian Volume Count. The pedestrian volume count was performed on all 297 links of the three study locations. A link is defined as the segment between two intersections. Dead-ends (such as cul-de-sacs) were excluded, as they do not enhance connectivity. On-the-spot count surveys were conducted by trained surveyors near the midpoint of each link, and 30-min pedestrian counts were recorded using click counters. While the recommended norm is to capture pedestrian volume counts over a 15-min duration [56], the preliminary site visits showed that pedestrians in Varanasi exhibit significant platooning behavior, particularly in commercial and institutional areas. The mean speed of Indian pedestrians in mixed traffic has been observed to be considerably lower when they move in a platoon than when they move individually [57]. A lower walking speed implies a lower pedestrian flow and, consequently, a lower volume observed within a fixed time frame. Therefore, to smooth out this variability in pedestrian volume, the counts were carried out for double the recommended duration, i.e., 30 min. The surveys were conducted during the period of peak pedestrian activity at the study locations (11 am to 1 pm) on three consecutive working days, and the average was calculated. Varanasi lacks organized public transport, and commuters rely on intermediate paratransit services (such as auto-rickshaws, e-autos, etc.), which have no fixed schedule. Although vehicular traffic shows distinct morning and evening peaks, pedestrian traffic typically shows no such peaks and is spread over the entire day. Nevertheless, the selected study locations were seen to be most active between the late-morning and afternoon hours (i.e., 11 am to 1 pm) on a typical weekday, which may be due to the presence of commercial establishments and institutions.

The counts were taken with no distinction for direction and followed the instructions of Al Sayed [58]. The 30-min average pedestrian volumes varied between 46 and 2614 pedestrians (Table 1). Lower pedestrian volumes (< 120 pedestrians per 30 min) were found on neighborhood-level streets adjacent to residential land uses. The proportion of links with low pedestrian volume was 48.3% in Location #1, 68.2% in Location #2, and 41.7% in Location #3.
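The averaging of the three daily counts and the share of low-volume links can be sketched as follows. The link identifiers and counts are hypothetical and are not survey data; only the threshold of 120 pedestrians per 30 min is taken from the text.

```python
LOW_VOLUME_THRESHOLD = 120  # pedestrians per 30 min, as quoted in the text


def average_count(daily_counts):
    """Mean of the 30-min counts recorded on consecutive working days."""
    return sum(daily_counts) / len(daily_counts)


def low_volume_share(link_counts, threshold=LOW_VOLUME_THRESHOLD):
    """Percentage of links whose average 30-min count is below `threshold`."""
    averages = [average_count(counts) for counts in link_counts.values()]
    low = sum(1 for avg in averages if avg < threshold)
    return 100.0 * low / len(averages)


# Hypothetical three-day counts for four links, for illustration only.
counts = {
    "link_a": [40, 52, 46],
    "link_b": [300, 280, 320],
    "link_c": [110, 130, 90],
    "link_d": [900, 1000, 1100],
}
share = low_volume_share(counts)  # two of the four links fall below 120
```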

Table 1 Thirty-minute pedestrian counts segregated as per landuse

Space-Syntax Indices. In the literature, SS techniques have been conceptualized on the basis of two types of urban space representation: axial and segment. In the axial representation, urban spaces are denoted by ‘axial lines’, the fewest and longest straight lines (i.e., sight lines/lines of vision) that cover an urban enclosure. Segment representations are created by splitting the original axial lines at intersections into smaller individual parts [31, 36, 58]. Segment-based SS analyses can be further sub-divided according to three distance concepts: topological (the number of directional changes required to reach a destination from an origin), angular (the sum of directional changes, in degrees, required to reach a destination from an origin), and metric (the sum of segment lengths required to reach a destination from an origin). Sharmin and Kamruzzaman (2018) concluded that segment-based SS analyses using angular distances explain pedestrian movement most effectively. These researchers also suggested that, of the most utilized SS indices (‘integration’, ‘choice’, ‘connectivity’, and ‘depth’), integration and choice have the strongest relationship with pedestrian volume.

This study focuses on the segment-based spatial representation of the road network, applying the angular ‘integration’, ‘choice’, and ‘connectivity’ indices. The representation of angular distance used in this study was adopted from Turner (2005) and is presented in Fig. 3. As per this figure, each turn angle is calculated with respect to the direction of movement, i.e., if a pedestrian walks from link A to B and then from B to C, the turn angles x and y (with respect to B) are estimated by drawing an imaginary “forward” link (represented by dotted lines). The same procedure is followed when the pedestrian moves from link C to A through B.

Fig. 3 Definition of angular distance adopted from Turner [59]
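The turn-angle definition can be made concrete with a short Python sketch that measures, at each junction, the deviation from the imaginary “forward” continuation of the incoming segment and sums these deviations along a route. The function names and coordinates are hypothetical, and the points are segment end points rather than the link labels of Fig. 3; any further weighting that the SS software applies to these angles is not reproduced here.

```python
import math


def turn_angle(p_prev, p_turn, p_next):
    """Deviation from straight-ahead, in degrees (0 to 180), at `p_turn`."""
    heading_in = math.atan2(p_turn[1] - p_prev[1], p_turn[0] - p_prev[0])
    heading_out = math.atan2(p_next[1] - p_turn[1], p_next[0] - p_turn[0])
    deviation = math.degrees(heading_out - heading_in)
    # Wrap into [-180, 180) so left and right turns are treated alike.
    deviation = (deviation + 180.0) % 360.0 - 180.0
    return abs(deviation)


def angular_distance(path):
    """Sum of turn angles along a route given as a list of (x, y) points."""
    return sum(
        turn_angle(path[i - 1], path[i], path[i + 1])
        for i in range(1, len(path) - 1)
    )


# Hypothetical route with two right-angle turns.
route = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
total_turn = angular_distance(route)  # about 180 degrees in total
```

Because the absolute deviation is used, walking the same route in the opposite direction yields the same angular distance, consistent with the symmetric treatment described for Fig. 3.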

Table 2 lists the 45 SS indices used in the study. These measures were obtained using the Space-Syntax Tool Kit (SSTK) plugin [60], a version of Depthmap X [61] tailored to work within the QGIS 2.18 environment.

Table 2 Description of Space-Syntax indices used in this study

Statistical Analyses

To examine the relationship between the pedestrian volume and SS indices, different statistical analyses were conducted. This included correlation, regression, and path analysis. Each analysis is subsequently described in this section.

Since all the SS indices used in the study are measured on a continuous scale, Pearson’s correlation co-efficient is best suited to test the relationship between the SS indices and pedestrian volume. Given a set of ‘n’ pairs of observations (x1, y1), (x2, y2) … (xn, yn), Pearson’s correlation co-efficient ‘r’ is given by Eq. 1.

$$r=\frac{\sum_{i=1}^{n}({x}_{i}-\overline{x})({y}_{i}-\overline{y})}{\sqrt{\sum_{i=1}^{n}{({x}_{i}-\overline{x})}^{2}\,\sum_{i=1}^{n}{({y}_{i}-\overline{y})}^{2}}}$$
(1)

where, \(\overline{x }, \overline{y }\) = mean values of x and y, respectively.

Ratner [62] suggested that an ‘r’ value between ± 0.50 and ± 1.00 indicates a high correlation between the two variables, a value between ± 0.30 and ± 0.49 indicates a moderate association, and a magnitude of less than 0.30 indicates a weak association.
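As an illustration, Eq. 1 and the strength bands attributed to Ratner [62] can be written as a short Python sketch. The function names are hypothetical, and treating magnitudes between 0.49 and 0.50 as ‘moderate’ is an assumption made here to close the gap between the quoted bands.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation co-efficient for paired observations (Eq. 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with n >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either variable is constant.
    return s_xy / math.sqrt(s_xx * s_yy)


def correlation_strength(r):
    """Label |r| as 'high', 'moderate' or 'weak' using the quoted bands."""
    magnitude = abs(r)
    if magnitude >= 0.50:
        return "high"
    if magnitude >= 0.30:
        return "moderate"
    return "weak"
```

For instance, `correlation_strength(0.38)` returns `'moderate'` and `correlation_strength(0.60)` returns `'high'`.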

To further examine the relationship between pedestrian movement and SS indices, a detailed regression analysis (and subsequently a path analysis) was required. Based on the correlations of the SS indices with pedestrian volume across all three locations, indices with weak correlations, or correlations that were not statistically significant at the 95% confidence level, were removed from further consideration. Stepwise Multiple Linear Regression (MLR) was carried out using Ordinary Least Squares (OLS) estimation, with pedestrian volume as the outcome variable. Since landuse and connectivity are important indicators of pedestrian movement [39], landuse (area in sq. m) was included along with the SS indices as explanatory variables. Other relevant variables, such as ROW (in meters) and presence of sidewalk, were also included in the model. Stepwise MLR was chosen for its ability to handle a large number of potential predictor variables and to fine-tune the model by choosing the best set of predictors from the available options [63]. Moreover, most studies in the SS literature employed MLR [36, 41, 64,65,66] for its ease of interpretation. Table 3 lists the variables used in the investigation.
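The idea behind stepwise selection can be sketched in plain Python as a forward-selection loop around an OLS fit. This is a simplified stand-in and not the procedure used in the study: the entry and removal criteria applied there are not stated, so adjusted R-squared is used as the (assumed) selection criterion, there is no removal step, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]  # fails if predictors are collinear
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Adjusted R-squared of an OLS fit of y on `columns` plus an intercept."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = k + 1
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(p)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


def forward_select(predictors, y):
    """Add, one at a time, the predictor that most improves adjusted R-squared."""
    chosen, best = [], float("-inf")
    remaining = list(predictors)
    # Keep at least one residual degree of freedom (n - k - 1 >= 1).
    while remaining and len(chosen) + 2 < len(y):
        scores = {
            name: adjusted_r2([predictors[c] for c in chosen + [name]], y)
            for name in remaining
        }
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break  # no candidate improves the fit any further
        chosen.append(name)
        remaining.remove(name)
        best = scores[name]
    return chosen, best
```

Here `predictors` would map variable names (for example an SS index, landuse area, or ROW) to per-link values and `y` would hold the per-link pedestrian volumes. Statistical packages typically enter and remove variables on the basis of significance tests rather than adjusted R-squared, so the selected set may differ from that of a packaged stepwise routine.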

Table 3 Variables used in the modeling framework

While the stepwise MLR procedure is instrumental in identifying the SS indices that best describe network connectivity and the magnitude of their influence on pedestrian movement, the technique measures only the quantum of influence and not the “chain” of impacts [67] on the outcome variable (i.e., pedestrian volume). This study therefore employs stepwise MLR as a means of further screening the large body of SS indices. The actual complexity of the model is better understood when the chains of influence exerted by the predictors on the outcome variable are viewed through the lens of causality. In fact, Streiner [67] asserts that path analysis is an extension of MLR that allows the examination of more complicated relations among variables than having several independent variables predict one dependent variable. This makes it possible to examine various hypotheses regarding the chain of impacts and to compare different models. While the rationale for using the path model is explained further in the sections “Stepwise MLR Results” and “Path Model Results”, it is worth noting that path models are useful for investigating indirect relationships. Path analysis is an advanced statistical method that decomposes correlations into direct and indirect effects [68], allowing the relationships among variables to be examined simultaneously. The analytical form of the model is shown in Eq. (2).

$$Y=\beta Y+\gamma X+\varepsilon$$
(2)

where Y = observable dependent variable; X = observable independent variable; \(\varepsilon\) = disturbances/errors in the model; \(\beta\), \(\gamma\) = model co-efficients/parameter estimates. It can be seen from Eq. (2) that path analysis rests on the assumptions of classical regression analysis. In fact, Cervero [69] suggests that model estimates of unidirectional relationships replicate those of OLS. Since the theoretical foundations of the path model lie in regression analysis, such models can be estimated by removing non-significant paths and estimating the residual co-efficient using the standard regression path co-efficients [68]. In the current study, path models were estimated using the SPSS AMOS 22 software package, which uses Maximum Likelihood Estimation (MLE) to assess the degree of influence of the independent variables on pedestrian volume.
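To make the notions of direct and indirect effects concrete, consider a minimal recursive example written in the form of Eq. (2). The mediating variable M and the co-efficients a, b and c are hypothetical symbols introduced only for illustration; they do not correspond to the path models estimated in this study.

$$M=aX+{\varepsilon }_{1},\qquad Y=cX+bM+{\varepsilon }_{2}$$

Stacking the two endogenous variables gives the matrix form of Eq. (2):

$$\begin{pmatrix}M\\ Y\end{pmatrix}=\begin{pmatrix}0& 0\\ b& 0\end{pmatrix}\begin{pmatrix}M\\ Y\end{pmatrix}+\begin{pmatrix}a\\ c\end{pmatrix}X+\begin{pmatrix}{\varepsilon }_{1}\\ {\varepsilon }_{2}\end{pmatrix}$$

Substituting the first equation into the second yields

$$Y=(c+ab)X+b{\varepsilon }_{1}+{\varepsilon }_{2}$$

so that, in this sketch, the direct effect of X on Y is c, the indirect effect transmitted through M is the product ab, and the total effect is c + ab. A predictor can therefore show a small or unstable co-efficient in a single-equation regression while still influencing the outcome appreciably through an intermediate variable.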

Results

Landuse and Network Assessment

The road network inventory and site visits established that, of the total 1180 km of roads [70] in Varanasi, less than 5 km have a sidewalk on one or both sides. Hence, it is common practice for pedestrians to walk along the side of the carriageway and share the road network. The GPS-based recording identified 63 m of dead ends with no interconnection to other streets; these streets were removed from the analysis. The hand-held GPS units were also used to record off-street trails used only by pedestrians, which were around 20 m in length and accounted for only 0.6% of the total 297 links across the three study areas. Because of their meager presence, a separate analysis of this pedestrian-only network was not carried out, contrary to the suggestions of Chin et al. [71].

The landuse maps generated from the GIS mapping exercise show that study location #1 (Fig. 4), adjacent to the Banaras Hindu University campus, consists of 52 links and has a total area of 0.438 sq. km, of which 62.8% is residential, 12.3% commercial, and 5.6% institutional. Of the total commercial area, 91.7% of the establishments are located on major streets, and 42.4% of these establishments are situated on a 452 m long stretch, the only section of the road network equipped with sidewalks. This suggests that this stretch is a potential local trip attractor within the study location. The reconnaissance survey also revealed that resident university students walk along this stretch for shopping and leisure trips.

Fig. 4 Generated landuse map for study location #1 in Varanasi

Study location #2 is the largest among the three locations with an area of 0.576 sq. km and consists of 160 links. Only 3 out of the 160 links have paved at-grade shoulders (informal sidewalks) for pedestrians. At the heart of this location is the intersection of two important streets (in the form of an ‘X’). The proportion of commercial landuse (34.1%) is higher than residential (29.6%) and institutional (22.3%). The commercial landuse here is concentrated along the ‘X’-shaped corridor. The reconnaissance survey of the area also found that this location housed affluent commercial establishments, such as shopping complexes, motor vehicle showrooms, luxury restaurants, etc., along with important institutional buildings, such as head post-office, government schools, etc. This location also serves as an important transfer point for auto-rickshaws.

Study location #3 consists of 78 links and is close to a regional railway station and an inter-city bus terminus. Pedestrian facilities at this location are almost non-existent. It is the smallest of the three locations, with an area of 0.297 sq. km. The proportions of commercial (48.2%) and institutional (40.4%) landuse are considerably greater than that of residential landuse (11.3%). The reconnaissance survey found that the commercial establishments comprise hotels, tourist lodges, etc., with inexpensive accommodation located further away from the major roads. This indicates a substantial demand for such facilities from passengers (from the railway station and bus terminus) who require temporary lodging.

Statistical Correlations with SS Indices

Pearson’s ‘r’ statistics were computed between the 30-min average pedestrian volume collected from the 297 links and the 45 SS indices listed in Table 2 to see whether a significant relationship exists. It was found that Integration at R = 10 steps (J from Table 2) was the most highly correlated with pedestrian volume (r = 0.6, p = 0.00), followed by NAIN (r = 0.57, p = 0.012) and Choice (r = 0.51, p = 0.03). This indicates a positive association between pedestrian volume and the connectivity of the streets within the three networks/locations. High correlations in this respect have also been noted by other researchers [31, 36, 37, 39]. Integration reflects the least angular distance needed to reach a particular link (destination), and depicts the ‘to-movement’ potential of the street. Choice reflects the least angular distance covered when using a particular link (selecting a route) to reach a destination, and therefore depicts the ‘through-movement’ potential. Streets with the highest Integration are the easiest to reach, since they are at the least angular distance from the other links in the network. Similarly, the link with the highest Choice is part of the most suitable route, since a pedestrian using this route covers the least angular distance to reach a destination. This is why pedestrians tend to use links with high Integration and high Choice. This aspect has been noted and confirmed by various research studies [72,73,74,75] and is the primary reason behind the application of Space Syntax theories to urban transportation studies (SS is also applied in studying architectural layouts). After weeding out SS indices with insignificant or weak correlations with pedestrian volume, 27 of the 45 SS indices were taken forward for further investigation, as their correlation values were in the range of 0.38 to 0.60 and were statistically significant.
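
The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data, and the use of 0.38 as a cutoff on |r| are hypothetical choices made here, and the sketch does not compute p-values, whereas the study also required statistical significance (a routine such as `scipy.stats.pearsonr` could supply both r and p).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def screen_indices(volume, indices, min_abs_r=0.38):
    """Return {index_name: r} for indices whose |r| with volume meets the cutoff."""
    kept = {}
    for name, values in indices.items():
        r = pearson_r(volume, values)
        if abs(r) >= min_abs_r:
            kept[name] = round(r, 3)
    return kept

# Hypothetical toy data for five links (NOT the study's observations).
volume = [120, 340, 910, 1500, 2100]
indices = {
    "NAIN": [0.40, 0.55, 0.90, 1.05, 1.20],
    "weak_index": [3.0, 1.0, 2.5, 1.2, 2.8],
}
kept = screen_indices(volume, indices)
```

With this toy data only the index that rises with pedestrian volume survives the screen; the weakly related one is dropped.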

The representation of two such SS indices for Location #1 is shown in Fig. 5. The outputs depict link values in a color-coded scheme that ranges from blue (cold color, low) to red (warm color, high). The Integration and NACH maps shown in Fig. 5 exhibit higher (red/orange) values on the major streets, which have commercial landuse and high pedestrian volumes. This is a typical characteristic of an organic street morphology (i.e., non-geometric, non-grid street layout) observed in developing economies, where commercial landuse is located along streets so as to be easily accessible to customers. These characteristics of the street morphology are further understood through the Pearson’s correlation between ‘connectivity’ and ‘integration’. This measure is called ‘intelligibility’ [34], and it describes how clear and direct the walking network is for its users. A low intelligibility value indicates poor spatial cognition and way-finding ability for its users [76, 77]. Choudhary and Adane (2012) measured the intelligibility values for five Indian cities and found a value of 0.026 for Varanasi, which is close to the current study’s finding (r = 0.014, averaged across the three locations). This indicates that the three study locations in Varanasi offer poor way-finding capability to their users.

Fig. 5

Representation of two Space-Syntax indices for Location #1

Stepwise MLR Results

As indicated in “Statistical Analyses”, classical regression analysis using stepwise MLR was carried out to identify SS indices having a significant impact on pedestrian volume. Across all three locations, the C, D, E, and F indices from Table 2 exhibited moderate to high correlation (between 0.46 and 0.53) with A and B. Likewise, I, J, K, L, and M were moderately correlated (between 0.38 and 0.47) with G and H. Therefore, to counter this multi-collinearity issue, different sets of stepwise MLR models were developed such that these correlated explanatory variables did not appear together. Over 450 different MLR models were tested, of which only the significant and best-fit models, estimated using SPSS 22, are presented in Table 4. This table shows that model 4 is best suited for all 297 data points (links) across the three locations. The significant factors that positively influenced pedestrian volume were ROW, presence of sidewalk, NAIN, and commercial area. That is to say, commercial areas alongside highly connected links (i.e., high NAIN values) with sidewalks will considerably impact pedestrian movement. It is interesting to note that, although the SS index ‘integration’ had the highest correlation with pedestrian volume as per the results described in “Statistical Correlations with SS Indices”, it was the second-most correlated SS index, NAIN, that was found to be significant in the regression results. Path models were developed to further explore model 4 in Table 4, and the results are explained in the following section.
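
The rule that correlated explanatory variables must not appear together can be expressed as a filter over candidate predictor sets. The sketch below is an assumption about how such sets could be enumerated, not a description of the SPSS procedure used in the study; the letters are the Table 2 labels quoted above, and `CONFLICT_GROUPS`, `is_admissible`, and `candidate_sets` are hypothetical names.

```python
from itertools import combinations

# Pairs of index groups reported as mutually correlated in the text
# (labels refer to Table 2). Correlations within a group are not assumed.
CONFLICT_GROUPS = [
    ({"C", "D", "E", "F"}, {"A", "B"}),
    ({"I", "J", "K", "L", "M"}, {"G", "H"}),
]

def is_admissible(predictors):
    """True if the set never mixes the two sides of any correlated group pair."""
    chosen = set(predictors)
    return not any(chosen & left and chosen & right
                   for left, right in CONFLICT_GROUPS)

def candidate_sets(variables, size):
    """Yield every predictor set of the given size that passes the rule."""
    for combo in combinations(sorted(variables), size):
        if is_admissible(combo):
            yield combo

# Example: pairs drawn from four indices; (A, C) and (G, J) are excluded.
pairs = list(candidate_sets(["A", "C", "G", "J"], 2))
```

Each admissible set would then be passed to a separate stepwise regression run.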

Table 4 Stepwise MLR outputs

Path Model Results

The path models for this study were hypothesized under three scenarios using the SPSS AMOS package, and their structures and standardized path coefficients are presented in Fig. 6. An inherent advantage of path analysis is that it allows testing various hypotheses concerning the path structure and the influence of various endogenous and exogenous variables. The three scenarios used in this research have been formulated with the following justifications:

  • Scenario 1: No hypothesis, i.e., there are no indirect relationships/mediating variables in the path structure. The only endogenous variable is ‘pedestrian volume’, which is impacted by the exogenous variables ‘NAIN’, ‘ROW’, ‘commercial landuse area’, and ‘presence of sidewalk’. This path model emulates the MLR model created in “Stepwise MLR Results” (model 4 from Table 4) and was built to see whether both models correspond to each other. It was reassuring to see that the path coefficients presented in Fig. 6a were the same as the standardized regression coefficients of model 4 shown in Table 4.

  • Scenario 2: Kang [38] suggests that “more people tend to walk along well-connected streets with a higher density of commercial and retail spaces”. This will further encourage the development of more commercial establishments along streets where the ROW is wide. Clearly, ‘commercial area’ and ‘pedestrian movement’ do not have a straightforward relationship. It was thus hypothesized that ‘ROW’ affects ‘pedestrian volume’ indirectly, mediated through ‘commercial area’; i.e., the path structure presented in Fig. 6b consists of the endogenous variables ‘pedestrian volume’ and ‘commercial area’ and the exogenous variables ‘NAIN’, ‘ROW’, and ‘presence of sidewalk’. ‘ROW’ impacts ‘pedestrian volume’ through ‘commercial area’ (i.e., indirectly) to represent the findings of Kang [38]. However, this is just one of the ways in which the path structure can be modeled; another hypothesis is presented as Scenario 3.

  • Scenario 3: Another reading of Kang’s [38] results is that ‘ROW’ affects ‘pedestrian volume’ directly (i.e., no mediation) as well as indirectly (i.e., mediation through ‘commercial area’). This path structure is presented in Fig. 6c. In this scenario, the endogenous variable ‘pedestrian volume’ is impacted by another endogenous variable, ‘commercial area’, which in turn is affected by ‘ROW’. Additionally, ‘pedestrian volume’ is influenced directly by the exogenous variable ‘ROW’. Such impact chains cannot be discerned from standard MLR models.

Fig. 6

Path model standardized outputs for the three scenarios

In addition to the above scenarios, 15 other path structures were hypothesized involving the remaining exogenous variables (e.g., ‘NAIN’ influencing ‘pedestrian volume’ directly and indirectly, etc.), but either the path coefficients turned out to be statistically insignificant or the parsimonious fit statistics/quality parameters did not reach acceptable levels. Hence, they are not presented here.

All the models consisted of variables that were statistically significant at the 95% confidence level. Parsimonious fit indices, such as the Goodness of Fit Index (GFI), Adjusted Goodness of Fit Index (AGFI), chi-squared, Normed Fit Index (NFI), Comparative Fit Index (CFI), and Akaike Information Criterion (AIC) [79], were used to judge the validity of the developed models. Along with these fit indices, other quality parameters, such as degrees of freedom (df), chi-square (\({\upchi }^{2}\)) values, root mean square error of approximation (RMSEA), and coefficient of determination (\({\mathrm{R}}^{2}\)), are presented in Table 5.

Table 5 Fit indices and other parameters for the three path model scenarios

Although Scenario 1 was noted to be the same as MLR model 4 of Table 4, the parsimonious fit indices (GFI, AGFI, NFI) for this path model were low as per Williams & Holahan (1994) [79]. Moreover, due to the absence of any mediating endogenous variable, parsimonious indices such as CFI and AIC were unable to produce realistic estimates. Scenario 2 depicts ROW affecting pedestrian volume indirectly, mediated through the commercial area. Its AGFI was 0.720, which is less than the recommended value of 0.90 [80, 81] and considerably lower than its GFI of 0.99. This indicates that the Scenario 2 path model has too many exogenous variables for explaining the process and that a simpler (or alternative) model should be favored. Moreover, as seen from Table 5, its \({\upchi }^{2}\)/df ratio is 9.2, which is above the acceptable value of 5 or less [82]. In contrast, Scenario 3 suggests that ‘ROW’ affects ‘pedestrian volume’ both directly and indirectly; it has an AGFI of 0.985, and its \({\upchi }^{2}\)/df ratio is well below 5. Interestingly, Scenario 1 and Scenario 3 have the same \({\mathrm{R}}^{2}\) value, indicating a capability to explain 56% of the variance in the acquired data; as mentioned earlier, Scenario 1 is a path model emulation of the model 4 MLR estimate shown in Table 4. Thus, Scenario 3 seems to be a better explanation of the underlying path structure: it explains the same amount of variance as the MLR model but with a better comprehension of the impacts on ‘pedestrian volume’.
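
The two cutoffs applied in this comparison (AGFI of at least 0.90 and a chi-square/df ratio of 5 or less) can be written as a small screening helper. This is a minimal sketch: the function name `assess_fit` is hypothetical, and because the exact chi-square/df ratio for Scenario 3 is not quoted in the text, the value 2.0 used below is a placeholder assumption standing in for "well below 5".

```python
def assess_fit(agfi, chi2_df_ratio, agfi_min=0.90, ratio_max=5.0):
    """Return pass/fail flags for the two cutoffs cited in the text."""
    checks = {
        "agfi_ok": agfi >= agfi_min,
        "chi2_df_ok": chi2_df_ratio <= ratio_max,
    }
    # A model is treated as acceptable only if both checks pass.
    checks["acceptable"] = all(checks.values())
    return checks

# Scenario 2: both values are quoted in the text.
scenario_2 = assess_fit(agfi=0.720, chi2_df_ratio=9.2)

# Scenario 3: AGFI is quoted; the ratio of 2.0 is a hypothetical placeholder.
scenario_3 = assess_fit(agfi=0.985, chi2_df_ratio=2.0)
```

Under these cutoffs Scenario 2 fails both checks, while Scenario 3 passes provided its actual ratio in Table 5 is indeed below 5.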

The decomposition of the standardized path coefficients for Scenario 3 is presented in Table 6. ‘ROW’, ‘presence of sidewalk’, ‘NAIN’, and ‘commercial area’ all impact ‘pedestrian volume’ with a positive total effect. The total influences are in the order ROW > presence of sidewalk > NAIN > commercial area. All the variables other than ‘ROW’ impact ‘pedestrian volume’ only directly. The direct effect of ‘ROW’ on ‘pedestrian volume’ is 0.36, whereas the indirect effect (mediated through the commercial area) is 0.07. The total effect of ‘ROW’ on ‘pedestrian volume’ is thus 0.36 + 0.07 = 0.43, indicating that if ‘ROW’ increases by one standard deviation, the pedestrian volume increases by 0.43 standard deviations. This may be because a higher ROW is generally associated with more accessible and important connections between destinations. Such important connections are corridors frequented by higher footfall and are therefore potential business locations for commercial establishments.
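
The arithmetic behind this decomposition follows the usual path-analysis convention: an indirect effect is the product of the standardized coefficients along the mediated path, and the total effect is the direct effect plus all indirect effects. The sketch below applies that convention; the direct (0.36) and indirect (0.07) values are those quoted above, while the two individual coefficients in the second example are hypothetical numbers chosen only to illustrate the product rule and are not read from Fig. 6c or Table 6.

```python
from math import prod

def indirect_effect(path_coefficients):
    """Product of the standardized coefficients along one mediated path."""
    return prod(path_coefficients)

def total_effect(direct, indirect_effects):
    """Direct effect plus the sum of all indirect effects."""
    return direct + sum(indirect_effects)

# Values reported in the text for ROW -> pedestrian volume (Scenario 3).
row_total = total_effect(0.36, [0.07])

# Hypothetical coefficients for ROW -> commercial area -> pedestrian volume,
# used only to show how a mediated effect is formed from two path legs.
example_indirect = indirect_effect([0.35, 0.20])
```

Rounded to two decimals, `row_total` reproduces the 0.43 total effect stated in the text.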

Table 6 Decomposition of standardized path coefficients for Scenario 3

Discussion and Practical Applications of Research

While it is well established [44, 83, 84] that provision of infrastructure is an effective way of promoting walking, the current study was able to demonstrate that walking infrastructure placed along a well-integrated street is vital for improving pedestrian movement. Specifically, the study results are better suited to commercial areas in an Indian city, since the results were not statistically significant for residential and institutional land use. However, this does not hamper the interpretation, because commercial areas are generally the most accessible locations in any Indian city. The Indian market place (also called bazaar in Hindi) is not only an economic center but also the driving force behind the city’s social cohesion. Naturally, market places attract large congregations of people who carry out their activities on foot. However, a city may have multiple commercial activity zones, which can range from a small area (consisting of maybe just one street) to a zone spread over a neighborhood/area. The results of the path model from this research are applicable to larger commercial areas, i.e., those spread over a network of roads/pedestrian links. Moreover, this analysis conducted for Varanasi may be considered a representative example for any Tier-2 Indian city with a dense population, streets with narrow ROW, and a largely absent/neglected walking built environment. The following sections provide brief discussions based on the results of the analysis.

NAIN as a Credible Index for Identifying Integrated Streets

NAIN is found to be the most suitable indicator of the level of interconnectedness in the pedestrian network, considering its ability to influence pedestrian movement. NAIN was highly correlated with pedestrian volume (r = 0.57, p = 0.012), making it the second-most correlated SS index after ‘integration’ (r = 0.60, p = 0.012). Despite this, the MLR models identified NAIN, rather than ‘integration’, as a statistically significant predictor of pedestrian volume. This reiterates NAIN’s ability to be a credible index for identifying integrated streets. A higher NAIN value is always attributed to a more integrated network and vice-versa. Since it is a normalized measure, it can be used to compare the spatial structure (link patterns and configurations) of different cities, and of areas within cities [35, 85]. As per the study findings, 52.7% of the links that had an average NAIN below 0.40 were observed to have lower pedestrian volumes (i.e., less than 120 pedestrians per 30 min) and were located on residential/neighborhood street links. However, the remaining 47.3% of the links, which were located in commercial and institutional areas, had an average NAIN value of more than 1. Moreover, 68.3% of the total links adjacent to commercial landuse had a NAIN ranging from 0.87 to 1.21 and accumulated an average of over 1800 pedestrians per 30 min. On the downside, NAIN is a unitless measure with no maximum or minimum value. This is why researchers have not been able to define benchmark values for “good” connectivity. Since NAIN is comparable across cities, a good practice would be to compare cities with similar road network topologies. The urban street network of Varanasi (as of most Tier-2 Indian cities) evolved organically, and hence is composed of an irregular grid geometry with sharp angular changes. A comparison of the NAIN obtained for Varanasi with that of other cities analyzed by Hillier et al. [85] is presented in Table 7.

Table 7 NAIN values from cities around the world

The literature search was unable to find other SS studies similar to Hillier et al. [85] that report NAIN values and at the same time study a geographically varying sample. While the Hillier et al. [85] study is almost a decade old, it must be noted that street networks in an urban area are rarely altered (i.e., new links are rarely created between streets). A street network may be expanded spatially, but its geometric configuration (i.e., angular change, etc.) usually remains constant. NAIN values may vary only over long periods (over centuries), as noted by Liao et al. [86], who studied the street development of historic towns in China.

Hillier et al. [85] compared the NAIN of 50 different cities globally and found that NAIN values ranged from 0.375 to 2.764. From Table 7, NAIN values are seen to be higher for cities with a gridded street topology than for organic ones. However, there is wide variation even among cities with an organic street pattern. Such cities are closer to Indian cities in terms of topology. The lowest NAIN was observed in Venice, followed closely by Middle Eastern cities, which were much less connected and integrated. Tokyo too was seen to have an irregular (organic) grid structure; however, its NAIN was higher than that of Beijing or Kyoto, implying higher connectivity in spite of an organic topology. Following these observations, the NAIN values for Varanasi sit among the NAIN values reported for Venice (0.32), the Middle Eastern cities (0.43 to 0.54), and the Asian cities (0.76 to 1.2). The NAIN for Varanasi was seen to vary between 0.37 and 1.25, with a mean of 0.77 and a standard deviation of 0.15. Thus, based on these observations from Table 7, a value of more than 0.9 can be considered “good” connectivity for a pedestrian network, at least for Varanasi. Therefore, streets with a NAIN value above 0.9 may offer potential business locations and may be graded according to their NAIN magnitudes in the inventory listings.
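
Grading streets by NAIN magnitude, as suggested above, amounts to filtering links against the 0.9 threshold and ranking what remains. The following is a minimal sketch under that reading; the link identifiers and NAIN values are hypothetical (picked from within the 0.37 to 1.25 range reported for Varanasi), and `grade_links` is an illustrative name rather than part of any SS software.

```python
GOOD_NAIN = 0.9  # threshold proposed in the text for Varanasi

def grade_links(nain_by_link, threshold=GOOD_NAIN):
    """Return (link_id, NAIN) pairs above the threshold, highest NAIN first."""
    above = [(link, value) for link, value in nain_by_link.items()
             if value > threshold]
    return sorted(above, key=lambda item: item[1], reverse=True)

# Hypothetical per-link NAIN values, not taken from the study's data set.
nain_by_link = {"L1": 0.37, "L2": 0.77, "L3": 0.95, "L4": 1.25, "L5": 0.90}
inventory = grade_links(nain_by_link)
```

Note that a link sitting exactly at 0.9 is excluded here, because the text speaks of values of more than 0.9.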

Connectivity Impact on Pedestrian Movement and Local Economy

Street links with a ROW of 15 m accounted for 33.8% of the total links adjacent to commercial landuse and had a pedestrian volume above 1500 pedestrians per 30 min. The maximum pedestrian volume per 30 min was 2614, as per Table 1. Moreover, 88.36% of these links had an average NAIN value over 1.06 (a ‘high’ value for Varanasi being 0.9, as shown in earlier sections). This suggests that street links with wider ROW and higher NAIN also have significant pedestrian volumes. In contrast, 52.7% of the links that were related to lower pedestrian volumes (i.e., less than 120 pedestrians per 30 min) were located on residential/neighborhood street links and had an average NAIN below 0.40. Evidently, ROW and NAIN affect the pedestrian volume in the commercial areas of Varanasi, indicating a need to prioritize walking facilities in such areas. This observation is consistent with the ‘theory of natural movement’, wherein highly integrated streets induce the flow of people and the land alongside such streets is made attractive for economic activities [35, 36]. Urban streets in India are categorized by ROW into arterial, sub-arterial, collector, and local streets [88]. A street with a wider ROW does not just accommodate more vehicles but also provides higher accessibility to a range of social and economic functions (including shopping, socializing, etc.). Walkable routes are a necessity to support these functions, since a customer generally arrives at a shop on foot. Moreover, Volker and Handy [89] provide a comprehensive review of 23 studies that suggest a positive economic impact on retail and food businesses from investments in pedestrian infrastructure. Augmenting the walking infrastructure (including the provision of sidewalks or improving the existing walking conditions) may help boost the sales of commercial establishments, thereby improving the local economy [90, 91].

Identifying and creating inventories of such streets will be an asset, since they may represent lucrative business locations. One street may be more connected than another, and so the demand for acquiring a location on such streets might vary. For example, only 5 out of the 91 links in study area 1 of Varanasi were seen to have formal sidewalks on both sides of the street. Their average NAIN was above 1.18 and their pedestrian volume was above 1800 pedestrians per 30 min. As expected, these street links were located adjacent to a commercial area. By comparison, another stretch of 5 links located adjacent to a commercial area in study area 2, with an average NAIN value of 0.98, was seen to accumulate an average of 1375 pedestrians per 30 min. This potential may be exploited toward asset monetization by the urban local bodies (conditional upon the land being owned by such authorities), wherein land/plots can be leased out at differentiated rents (based on location) to competing businesses. To capture this untapped potential, it is of utmost importance to understand the degree of connectivity of such streets. This can be accomplished using the NAIN index. Asset monetization has recently been in the Indian government’s focus owing to the National Monetization Pipeline Plan [92]; however, this mechanism has been observed earlier in the city bus service sector. For example, it is common practice for BRT (Bus Rapid Transit) authorities to lease out space to businesses for advertising purposes. The Surat, Ahmedabad, and Pune BRT systems use the outside surface of BRT vehicles and spaces in depot infrastructure to host colorful posters for business advertisements at pre-defined tariffs [93,94,95]. A similar mechanism is followed by other BRT authorities for commercial development within bus terminus premises.

Prioritization Tool for Enhanced Urban Mobility Planning

Vision and goals for urban transportation planning in Indian cities are currently drafted through Comprehensive Mobility Plans (CMPs) developed by Urban Local Bodies (ULBs), transportation authorities, and development agencies. These planning documents provide a city-level overview of existing transportation systems and formulate sustainable transportation scenarios so that future urban transport projects can be identified. Despite having a section dedicated to NMT, CMPs report only an infrastructure-level overview of sidewalks, crosswalks, etc. The importance of network planning is often overlooked and is not represented in such reports. Although a CMP preparation toolkit has been drafted [96] to assist cities, it does not make any provision for network-level assessment of pedestrian facilities/street networks. This study recommends to policy-makers and practitioners that pedestrian network assessment be prioritized and included as a part of the CMP. More specifically, SS analysis may be included as a mandatory component, since it can be an effective tool before any field work is undertaken. NAIN values estimated from such an analysis give a fair idea of the network connectivity. A good starting point would be to identify ways of increasing the number of interconnections to existing links so as to enhance the NAIN values. Of course, creating new links in an urban area is very difficult; thus, the practitioner may opt for identifying links that have been blocked or encroached upon.

In terms of complexity, the effort required to compute SS indices is extremely low. Practitioners are required to feed only the GIS layer of the road network to the open-source DepthmapX software to obtain these SS indices. The GIS layers are generally developed by ULBs and are reported in the CMPs.

Practitioners may take advantage of high NAIN values (e.g., > 0.9 for Tier-2 Indian cities with organic street morphology, as shown earlier) while planning, designing, and implementing walking infrastructure in commercial areas. Since wider ROW induces commercial activity (as evident from “Path Model Results”), it is likely that such streets will be highly accessible and well-connected to other streets in the network. These streets can be identified and earmarked for sidewalk provision or upgradation, giving policy-makers and local authorities a rule of thumb for prioritization while allocating funds.
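
One way to turn this rule of thumb into a shortlist is sketched below. Everything in it is an assumption made for illustration: the `Link` record, its field names, the sample values, and in particular the choice to rank candidates by ROW first and NAIN second, which is one plausible ordering rather than a procedure stated in the study.

```python
from dataclasses import dataclass

@dataclass
class Link:
    link_id: str
    nain: float         # Normalized Angular Integration of the link
    row_m: float        # right-of-way width in metres
    commercial: bool    # adjacent to commercial landuse
    has_sidewalk: bool  # formal sidewalk already present

def earmark_for_sidewalks(links, nain_min=0.9):
    """Commercial links above the NAIN cutoff that lack a sidewalk,
    ordered by widest ROW and then highest NAIN (an assumed ranking)."""
    candidates = [link for link in links
                  if link.commercial
                  and link.nain > nain_min
                  and not link.has_sidewalk]
    return sorted(candidates,
                  key=lambda link: (link.row_m, link.nain),
                  reverse=True)

# Hypothetical links, not drawn from the Varanasi data.
links = [
    Link("A", 1.18, 15, True, False),
    Link("B", 0.98, 9, True, False),
    Link("C", 1.10, 12, True, True),    # already has a sidewalk
    Link("D", 0.40, 6, False, False),   # residential, low NAIN
    Link("E", 0.95, 15, True, False),
]
shortlist = [link.link_id for link in earmark_for_sidewalks(links)]
```

A practitioner could replace the ranking key with any locally agreed criterion, such as observed pedestrian volume.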

To summarize the practical implications of this research (Fig. 7) from the preceding discussions, practitioners may consider a pedestrian network analysis using SS techniques. Various open-source software packages (such as DepthmapX) are available to assist them in estimating the NAIN value for each link in a commercial-area walking network. Once the highly connected links are identified, three interventions can be formulated: (a) reporting these highly connected links in the existing planning documents; (b) retrofitting or upgrading these links with quality walking BE (such as complete street designs, pedestrian streets, etc.); and (c) identifying vacant commercial plots on such links for asset monetization, such that the revenues generated could be diverted toward other developmental activities (such as replenishing the urban transport fund, supporting public transit services, etc.).

Fig. 7

Practical implication: action plan for practitioners

Conclusion and Way Forward

The study carried out a pedestrian network connectivity analysis using the Space-Syntax (SS) technique to determine the role of the pedestrian network BE in influencing walking. Three pedestrian networks were selected from Varanasi and digitized in a GIS environment with the help of satellite imagery, GPS, and multiple field visits. First, 45 SS indices were calculated using an open-source DepthmapX plugin, their correlations with pedestrian volume were assessed, and the most significant indices were identified. Second, 27 of the 45 SS indices were taken forward to estimate multiple stepwise MLR models with pedestrian volume as the outcome variable, along with other predictor variables such as landuse, ROW, and presence of sidewalk. Finally, path analysis was conducted to identify the path structure and the total, direct, and indirect influence of the exogenous variables on the endogenous variable. The study concluded that Normalized Angular Integration (NAIN) was the most suitable SS index influencing pedestrian volume and that it can be used as a credible index for measuring pedestrian network connectivity. The path models suggested that ROW affects pedestrian volume both directly and indirectly in a highly integrated commercial area where pedestrian infrastructure is in place. It is intuitive that commercial landuse will attract higher pedestrian movement. However, this study shows that the relationship is not straightforward when walking network connectivity is considered, i.e., pedestrian volume is not influenced just by a wide ROW or the presence of pedestrian infrastructure; such conditions should also be available along urban links with high connectivity. Furthermore, the study observes that highly integrated links are located on commercial landuse, which attracts higher footfall. It was seen that 88.36% of the commercial links with more than 15 m ROW had an average NAIN value over 1.06 and also accumulated an average pedestrian volume above 1500 pedestrians per 30 min. However, 52.7% of the links that were related to lower pedestrian volumes (i.e., less than 120 pedestrians per 30 min) were located on residential/neighborhood street links and had an average NAIN below 0.40. The research findings support the view that network connectivity is an essential aspect of the built walking environment and should not be addressed in isolation; rather, it should be addressed at the stages of planning, designing, and implementation, so as to better influence pedestrian movement. This strategy will have economic implications when planning, redeveloping, or enhancing the built walking environment in commercial areas.

The current study is a novel attempt to relate pedestrian movement to SS assessments in an Indian city. Previous studies [54, 78] conducted urban morphological change analysis and did not offer any comment on pedestrian movement or the built walking environment. It is seen that although typical Indian cities are largely devoid of walking infrastructure, network connectivity can play an important role in influencing pedestrian movement. This confirms the initial supposition that infrastructure provision can influence pedestrian volume, and that this influence is further enhanced if the sidewalk infrastructure forms a well-connected system as a whole. A limitation of the study is that neither the regression nor the path models consider the quality of the sidewalks. This may be the reason why both models were unable to explain more than 56% of the variance observed. The inclusion of quality variables (such as width of sidewalk/footways, esthetics and walking surface quality, degree of encroachment, continuity, slope, etc.) may further improve the performance of the models.