1 Introduction

Walking is a basic and traditional mode of movement, and it is associated with all other modes of transportation. For the sustainable development of an urban transportation system, pedestrian facilities must be treated as an essential component. Developing a pollution-free, safe, convenient and comfortable transportation system requires that pedestrian facilities be improved. Traffic congestion and rising accident rates are major problems in developing countries because of the growing number of motorized vehicles and the lack of proper planning. A modal shift to non-motorized transportation is needed, which is possible only by providing better facilities to users. Analysis of pedestrian flow characteristics is required to evaluate the existing condition of walking facilities. Traffic characteristics can be defined based on macroscopic and microscopic approaches [1–3]. May [4] stated that the modelling of vehicles and of pedestrians differs only in numerical values and units.

Fruin [5] observed pedestrian flow characteristics based on a macroscopic approach, which was adopted by TRB in 1985. Pedestrian flow characteristics can be analysed using three parameters, namely speed (U), flow (Q) and density (k). As per basic traffic flow theory, the relationship among the three principal variables of traffic flow (Eq. 1) is used to derive the traffic flow characteristics (speed–density, flow–density and speed–flow). Here speed is expressed as a function of density to describe the speed–density relationship (Eq. 2).

$$Q = U \times k,$$
(1)
$$U = f(k).$$
(2)

The objective of this study was to examine the classical relationships between pedestrian flow parameters from field data, using deterministic and artificial neural network (ANN) approaches, on sidewalks under heterogeneous conditions in Indian cities. Deterministic models define average system behaviour considering physical laws. Based on these classical models, flow parameters such as jam density, optimum density, free flow speed and optimum speed were derived to describe pedestrian characteristics. Greenshields [6] proposed a linear relationship between speed and density, whereas Underwood [7] proposed an exponential relationship.

The artificial neural network (ANN) concept was first proposed by Warren McCulloch and Walter Pitts in 1943 [8]. ANN has been used to develop vehicular pollution models that predict air pollution concentration in the urban environment [9] and to develop a vehicle delay estimation model [10]. In pedestrian studies, a flow prediction model and a pedestrian tracking system have been developed using the ANN approach [11].

In this study, ANN is used to analyse the relationship between pedestrian flow parameters because it is more realistic and better able to capture traffic dynamics. An ANN model is proposed for modelling pedestrian flow based on the observed pattern of field data, and it is validated by comparison with deterministic models through various statistical analyses.

2 Review on past studies

Since 1960, pedestrian flow characteristics have been defined mathematically in terms of speed–density models based on the macroscopic approach. Researchers in different countries have studied pedestrian flows at facilities such as walkways, sidewalks and central business district (CBD) areas, under unidirectional, bidirectional and mixed traffic conditions. Most researchers have observed a linear relationship between speed and density [12–22]. Rahman et al. [23] developed a speed–density relationship based on ordinary least squares (OLS) and weighted regression methods to observe pedestrian characteristics in Dhaka. Parida et al. [24] found an exponential regression model to be the best fit for sidewalk movement in Delhi. Al-Azzawi and Raeside [25] developed a quadratic relationship between the reciprocal of walking speed and pedestrian density for sidewalks in the UK.

The relationship between pedestrian flow parameters can be described using the macroscopic fundamental diagram (MFD). Pedestrian flow parameters such as free flow speed, jam density, optimum density, optimum speed and capacity can be estimated from the MFD. In the deterministic approach, all variables are calibrated from mathematical models based on the basic relationship in Eq. (1). In previous studies, pedestrian density was reported as jam density or as optimum density. Many researchers [12–15, 19] estimated optimum density or jam density by assuming a linear speed–density relationship, but there is no assurance that the jam density and free flow speed deduced from traffic flow data and these fundamental models exist in every situation. The estimated optimum density is 2.08 P/m² at an intermodal transfer terminal in Calcutta [26], 1.64 P/m² at a comprehensive transport terminal in Beijing [27], 1.53 P/m² in confined passageways of a metro station in Shanghai [28], 1.89 P/m² near Anand Vihar Inter State Bus Terminal, New Delhi [29], 1.85 P/m² for a sidewalk in Dhaka [23] and 2.22 P/m² at level walkways inside a DTSP hall [30]. In these previous studies, the space available to each pedestrian at maximum flow is less than the space a pedestrian requires; in most of them, space was calibrated assuming a linear relationship between speed and density.

Dr. Robert Hecht-Nielsen defined an artificial neural network (ANN) as “a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs”. Owing to its adaptive nature, such a system is capable of machine learning as well as pattern recognition. Florio and Mussone [31] evaluated the flow–density relationship of a motorway section to define the time and spacing stability or instability of its motorized traffic flow. Zhao and Thorpe [11] used stereo-based segmentation and neural network-based recognition to detect pedestrians. Most researchers [31–33] focussed on traffic flow prediction based on the ANN approach. An ANN for short-term prediction of traffic volume was developed using past traffic data on NH-58 [34]. Sahani and Bhuyan [35] used ANN clustering to define LOS levels. In this paper, the ANN approach is used to develop the relationships between flow parameters.

3 Data collection and extraction methodology

Data were collected for sidewalks and carriageways around transport terminals in Roorkee, Dehradun and Kolkata. Videography was used to collect data characterizing pedestrian movement, and data were collected for 8–10 h on weekdays. The layout of one of the selected test sections is shown in Fig. 1. The camera was fixed at a vantage point to obtain an overall view of the test section. The trap section was marked with self-adhesive yellow road tape to make it visible in the video. Peak hour data were analysed to evaluate pedestrian characteristics on sidewalks. The peak hour was identified from a 16-h pedestrian demand survey at every study location. Details of the data collected at Dehradun railway station are shown in Fig. 2, which gives the 16-h hourly pedestrian traffic demand on a weekday and the 4-h morning and evening peak profile at a 30-s measurement interval. For the Howrah bridge terminal, pedestrian flow variations from 9 am to 11 am and from 4 pm to 6 pm are shown in Fig. 3. For the Roorkee terminal, the 1-h morning and 1-h evening peak flow variations are also shown in Fig. 3. Fundamental relationships between flow parameters were developed for carriageway movement using 4-h data and for sidewalk movement using 6-h data.

Fig. 1 Layout of data collection trap in Dehradun railway station

Fig. 2 Collected data in Dehradun railway station. a Hourly flow variation at Dehradun railway station. b Peak hour flow variation at 30-s interval

Fig. 3 Flow variation in morning and evening peaks in 30-s time interval. a During 4 h at Howrah bridge terminal. b During 2 h at Roorkee railway terminal

Pedestrian flow parameters were extracted manually from the videos. Manual extraction is time consuming but ensures the accuracy of the data. Speed and flow were extracted directly from the videos, and density was estimated using the fundamental traffic flow equation (Eq. 1). Das et al. [36] optimized the data extraction technique for analysis of pedestrian flow on sidewalks, and the extraction method in this study was adopted from that work. Data were extracted at a 30-s measurement interval. Flow was obtained by counting the number of pedestrians crossing the mid-section of the trap in each 30-s interval and converting the count into a flow rate. Speed was estimated by dividing the length of the trap by the travel time taken by pedestrians to cross it, as observed from the videos.
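
The extraction steps described above can be sketched in a few lines of Python. This is only an illustration and not the authors' tooling: the function names, the trap dimensions, the normalisation of flow by an assumed effective width and all sample values are hypothetical.

```python
def flow_rate(count, interval_s=30.0, width_m=1.0):
    """Pedestrians counted in one interval -> flow rate in P/min/m."""
    return count * (60.0 / interval_s) / width_m


def mean_speed(trap_length_m, travel_times_s):
    """Average speed (m/min) from the mean travel time of sampled pedestrians."""
    mean_time_s = sum(travel_times_s) / len(travel_times_s)
    return trap_length_m / mean_time_s * 60.0


def density(flow, speed):
    """Density k = Q / U (Eq. 1), in P/m^2."""
    return flow / speed


# Hypothetical 30-s interval: 20 pedestrians, trap assumed 2 m wide and 5 m long,
# travel times (s) of five randomly selected pedestrians.
q = flow_rate(count=20, interval_s=30.0, width_m=2.0)  # P/min/m
u = mean_speed(5.0, [4.0, 4.5, 3.8, 4.2, 4.0])         # m/min
k = density(q, u)                                      # P/m^2
```

Because density is derived from Eq. 1 rather than counted directly, any error in the extracted speed or flow propagates into the density estimate.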

Sample size is measured as the number of data points collected during peak hour movement of pedestrians. For speed extraction, five pedestrians were selected at random in each 30-s measurement interval, and their average travel time was used to obtain the average speed for that interval in m/min. Details of the sample size are shown in Table 1. The required sample size was calculated by

$$n = \left( {\frac{z\sigma }{E}} \right)^{2},$$
(3)

where n is the sample size, z is the standard normal variable, σ is the standard deviation of the sample and E is the permissible error.
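
As a rough illustration of Eq. 3, the sketch below computes the required sample size for hypothetical inputs. The values of z, σ and E are assumptions chosen for the example and are not taken from Table 1; rounding up to a whole observation is also an assumption.

```python
import math


def required_sample_size(z, sigma, error):
    """Eq. 3: n = (z * sigma / E)^2, rounded up to a whole observation."""
    return math.ceil((z * sigma / error) ** 2)


# Hypothetical inputs: 95 % confidence (z = 1.96), sigma = 10 m/min, E = 1 m/min.
n = required_sample_size(1.96, 10.0, 1.0)
```

Halving the permissible error quadruples the required sample size, since E enters Eq. 3 squared.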

Table 1 Details of collected samples

4 Statistical description of observed data

This section describes the characteristics of pedestrian flow and speed on pedestrian facilities. The statistical summary of observed pedestrian flow characteristics on sidewalks is given in Table 2, which shows that the standard deviation and variation of the flow data are larger than those of the speed data. The cumulative speed distribution is illustrated in Fig. 4. With the bin width set by Scott's rule (Eq. 4), the 15th, 50th and 85th percentile speeds were estimated from the cumulative probability distribution curve (Fig. 4) as 63.8, 73.28 and 80.5 m/min, respectively. The nature of the speed distribution is measured in terms of the speed ratio (SR). The calibrated speed ratio (Eq. 5) is 0.99, which indicates that the speed distribution for sidewalks is bell shaped.

Table 2 Statistical summary of observed pedestrian flow data
Fig. 4 Cumulative probability distribution for pedestrian speed

$${\text{BinWidth}} = \frac{3.5\sigma }{{n^{1/3}}},$$
(4)
$${\text{SR}} = \frac{{S_{85} - S_{50} }}{{S_{50} - S_{15} }}.$$
(5)
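
Equations 4 and 5 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: percentiles are obtained by linear interpolation on the sorted sample rather than read from a binned cumulative curve, σ is taken as the sample standard deviation, and the speed values are hypothetical.

```python
import statistics


def scott_bin_width(speeds):
    """Eq. 4: bin width = 3.5 * sigma / n^(1/3)."""
    sigma = statistics.stdev(speeds)
    return 3.5 * sigma / len(speeds) ** (1.0 / 3.0)


def percentile(values, p):
    """p-th percentile by linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def speed_ratio(speeds):
    """Eq. 5: SR = (S85 - S50) / (S50 - S15)."""
    s15, s50, s85 = (percentile(speeds, p) for p in (15, 50, 85))
    return (s85 - s50) / (s50 - s15)


# Hypothetical, symmetric sample of speeds in m/min.
speeds = [60, 65, 68, 70, 72, 73, 74, 76, 78, 81, 86]
width = scott_bin_width(speeds)
sr = speed_ratio(speeds)
```

For a sample that is symmetric about its median, as here, the two percentile gaps are equal and SR equals 1; values away from 1 indicate a skewed speed distribution.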

5 Modelling the relationship between pedestrian flow parameters using deterministic & ANN approach

Relationships between pedestrian flow parameters were modelled, and the macroscopic flow parameters were evaluated analytically. The first stage is “model development”, in which the fundamental relationships between the principal flow parameters are established. In the next stage, flow parameters such as free flow speed ($U_{\text{f}}$), optimum speed ($U_{\text{m}}$), optimum density ($k_{\text{m}}$), jam density ($k_{\text{j}}$) and capacity ($q_{\text{m}}$) are estimated from the fundamental relationships (Eqs. 6 and 7). Theoretically, free flow speed is the speed that occurs when density and flow are zero; it also describes the average pedestrian speed when no congestion or other adverse conditions exist. Jam density occurs in the no-flow condition, i.e. when movement is not possible. Capacity is the maximum rate of flow on the sidewalk. Density and speed at capacity are defined as optimum density and optimum speed, respectively, and can be estimated from the MFD. Here the speed–density relationship is taken as the fundamental relationship because the model is simple and easy to interpret. The correlation coefficient between speed and density is 0.87 for sidewalks. Flow–density, speed–flow and flow–space relationships were derived from the basic speed–density relationship for the deterministic models.

5.1 Deterministic modelling

Deterministic single-regime speed–density models were fitted to the data; they are presented in Eqs. 6 and 7. Free flow speed, jam density and capacity were determined from the fitted models, assuming a basic linear relationship for conventional approaches.

Greenshields’ Model (1935):

$$U = U_{\text{f}} - \left( {\frac{{U_{\text{f}} }}{{k_{\text{j}} }}} \right)k.$$
(6)

Underwood Model (1961):

$$U = U_{\text{f}} \,\text{e}^{ - k/k_{\text{m}} } .$$
(7)
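
For reference, the optimum values implied by Eqs. 6 and 7 follow from substituting each model into Eq. 1 and setting the derivative of flow with respect to density to zero. This is the standard textbook derivation, given as a worked sketch rather than as the authors' own calculation.

For the Greenshields model:

$$Q = U_{\text{f}} k - \frac{U_{\text{f}}}{k_{\text{j}}}k^{2}, \quad \frac{{\text{d}}Q}{{\text{d}}k} = U_{\text{f}} - \frac{2U_{\text{f}}}{k_{\text{j}}}k = 0 \;\Rightarrow\; k_{\text{m}} = \frac{k_{\text{j}}}{2},\quad U_{\text{m}} = \frac{U_{\text{f}}}{2},\quad q_{\text{m}} = \frac{U_{\text{f}} k_{\text{j}}}{4}.$$

For the Underwood model:

$$Q = U_{\text{f}} k\,\text{e}^{-k/k_{\text{m}}}, \quad \frac{{\text{d}}Q}{{\text{d}}k} = U_{\text{f}}\,\text{e}^{-k/k_{\text{m}}}\left( 1 - \frac{k}{k_{\text{m}}} \right) = 0 \;\Rightarrow\; k = k_{\text{m}},\quad U_{\text{m}} = \frac{U_{\text{f}}}{\text{e}},\quad q_{\text{m}} = \frac{U_{\text{f}} k_{\text{m}}}{\text{e}}.$$

The Underwood model therefore has no finite jam density, since speed approaches zero only as density tends to infinity.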

Calibrated speed–density models, with the estimated flow parameters for sidewalks, are given in Table 3. Estimated flow–density and flow–space relationships are given in Table 4. Figures 5 and 6 show the MFDs (speed–density, flow–density and speed–flow) of these models for sidewalk and carriageway movement using the deterministic approach. The estimated optimum density and capacity can be read from these MFDs.
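
One common way to calibrate Eqs. 6 and 7 is ordinary least squares, applied directly for the Greenshields model and to the logarithm of speed for the Underwood model. The sketch below assumes that procedure; the paper does not state the fitting method used, and the data and function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def fit_greenshields(k, u):
    """Eq. 6: U = Uf - (Uf / kj) * k. Returns (Uf, kj)."""
    a, b = fit_line(k, u)
    return a, -a / b


def fit_underwood(k, u):
    """Eq. 7 in log form: ln U = ln Uf - k / km. Returns (Uf, km)."""
    a, b = fit_line(k, [math.log(ui) for ui in u])
    return math.exp(a), -1.0 / b


# Hypothetical noise-free data generated from Uf = 80 m/min, kj = 4 P/m^2.
k_obs = [0.2, 0.5, 1.0, 1.5, 2.0]
u_obs = [80.0 - 20.0 * ki for ki in k_obs]
uf, kj = fit_greenshields(k_obs, u_obs)
capacity = uf * kj / 4.0  # Greenshields capacity, q_m = Uf * kj / 4
```

Fitting the Underwood model in log space minimises relative rather than absolute speed errors, so its parameters can differ slightly from a direct nonlinear fit.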

Table 3 Calibrated deterministic models around transport terminals
Table 4 Relationships between pedestrian flow parameters using deterministic models
Fig. 5 MFD of pedestrian flow models on sidewalks around transport terminal (conventional approach). a Speed–density model. b Flow–density model. c Speed–flow model

Fig. 6 MFD of pedestrian flow models on carriageways around transport terminal (conventional approach). a Speed–density model. b Flow–density model. c Speed–flow model

5.2 ANN approach

The ANN approach is adopted in this study to develop pedestrian flow relationships that introduce nonlinear behaviour, as an alternative to conventional approaches. Deterministic models are made of passive data structures, which are normally manipulated by an active procedure. Neural network models show global system behaviour that emerges from local interactions. The learning process of an ANN model follows an input–output mapping and adapts its synaptic weights. Using the NF tool in MATLAB, five ANN models were developed; the details are given in Table 5. A neural network model consists of processing elements (neurons) and connections (links). Models based on the neural network approach are efficient and practical because they are implemented and trained directly on real data. The network is referred to as a layered network, in which hidden units lie between input and output units. The architecture of a typical neural network is shown in Fig. 7.

Table 5 Performance of ANN models based on pedestrian flow relationships
Fig. 7 Structure of neural network (Ref. [37])

In this study, a two-layer feedforward network trained with the Levenberg–Marquardt algorithm is used for the ANN models. A feedforward network consists of a series of layers, each connected to the previous one, and the final layer produces the network’s output. Of the data, 85 % were used for training and 15 % for validation. A sigmoid transfer function was used in the hidden layer and a linear transfer function in the output layer. Feedforward computation consists mainly of simple sum, product and sigmoid evaluations. Levenberg–Marquardt backpropagation (trainlm) was used as the network training function, as it is among the fastest backpropagation algorithms. Network performance was measured by the mean squared error (MSE). It can be observed from Table 5 that the ANN 4 model performs better than the other ANN models in terms of R value and performance measure, where R measures the strength of the relationship between dependent and independent variables. The fundamental pedestrian flow models obtained with the ANN 4 model are shown in Fig. 8 for the sidewalk facility, and Fig. 9 shows those obtained with the ANN 3 model for the carriageway facility.
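
The forward computation of such a network can be sketched in plain Python. This is not the MATLAB implementation used in the study, and it omits Levenberg–Marquardt training entirely. The single input (taken here to be density), the two hidden neurons, the weights and the ordered 85/15 split are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid used for hidden-neuron activation."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Two-layer feedforward pass: sigmoid hidden layer, linear output layer."""
    hidden = [sigmoid(w * x + b) for w, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def mse(predicted, observed):
    """Mean squared error, the performance measure named in the text."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def split(samples, train_fraction=0.85):
    """Ordered split: first 85 % for training, the rest for validation."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Hypothetical, untrained weights: one input (density) and two hidden neurons.
speed = forward(1.0, w_hidden=[0.0, 0.0], b_hidden=[0.0, 0.0],
                w_out=[40.0, 60.0], b_out=10.0)
```

Each hidden neuron performs one product, one sum and one sigmoid evaluation, which is the "sum, product and sigmoid" sequence the text refers to; training would adjust the weights to minimise the MSE on the training subset.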

Fig. 8 Pedestrian fundamental flow relationships on sidewalks around transport terminal using ANN. a Speed–density model. b Flow–density model. c Speed–flow model

Fig. 9 Pedestrian fundamental flow relationships on carriageways around transport terminal using ANN. a Speed–density model. b Flow–density model. c Speed–flow model

“Capacity is reached when the product of density and speed results in the maximum flow rate” [38]. This condition gives the optimum speed, optimum density and maximum flow rate, which can be determined from the MFD (Figs. 8, 9). For sidewalk movement, the best fitted ANN model (ANN 4) gives an optimum density of 1.60 P/m², an optimum speed of 47.54 m/min and a capacity of 76.06 P/min/m. For carriageway movement, the optimum density is 1.60 P/m², the optimum speed is 56.38 m/min and the capacity is 90.20 P/min/m.
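
The capacity condition quoted above can be applied numerically to any speed–density curve, including one produced by a trained network. The sketch below scans a grid of densities and keeps the point of maximum flow. The function name, the grid resolution and the speed–density curve used in the example (a Greenshields form with hypothetical parameters) are assumptions. The reported sidewalk values are consistent with Eq. 1, since 1.60 × 47.54 ≈ 76.06.

```python
def capacity_point(speed_of_density, k_max, steps=10000):
    """Scan densities in (0, k_max] and return (k_m, U_m, q_m) at maximum flow."""
    best = (0.0, speed_of_density(0.0), 0.0)
    for i in range(1, steps + 1):
        k = k_max * i / steps
        u = speed_of_density(k)
        q = u * k  # Eq. 1: flow is the product of speed and density
        if q > best[2]:
            best = (k, u, q)
    return best


# Hypothetical speed-density curve: Greenshields form, Uf = 80 m/min, kj = 4 P/m^2.
k_m, u_m, q_m = capacity_point(lambda k: 80.0 - 20.0 * k, k_max=4.0)
space_at_capacity = 1.0 / k_m  # m^2/P, the reciprocal of optimum density
```

The reciprocal of the optimum density gives the space per pedestrian at capacity, which is how a density in P/m² translates into a space requirement in m²/P.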

6 Validation of models

Validation is an essential part of modelling; it demonstrates that the model is a reasonable representation of the actual system. The coefficient of correlation, the coefficient of determination, the mean absolute error (MAE) and the root mean square error (RMSE) are used for model validation. RMSE represents the sample standard deviation of the differences between predicted and observed values and is estimated using Eq. 8. MAE is another useful measure for model evaluation (Eq. 9). The calibrated models and the estimated RMSE and MAE values for sidewalks are given in Table 6.

Table 6 Accuracy measurements for model performance and evaluation
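$${\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{t = 1}^{N} {\left( {E_{t} - O_{t} } \right)^{2} } } ,$$
(8)
$${\text{MAE}} = \frac{1}{N}\sum\limits_{t = 1}^{N} {\left| {E_{t} - O_{t} } \right|} ,$$
(9)

where $E_{t}$ is the estimated value, $O_{t}$ is the observed value and N is the number of observations.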
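
The two error measures can be computed as in the following sketch, which uses the standard definitions of RMSE (root of the mean squared difference) and MAE (mean of the absolute differences). The observed and estimated speeds are hypothetical.

```python
import math


def rmse(estimated, observed):
    """Root mean square error between estimated and observed values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)


def mae(estimated, observed):
    """Mean absolute error between estimated and observed values."""
    n = len(observed)
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / n


# Hypothetical speeds in m/min.
observed = [70.0, 65.0, 60.0, 55.0]
estimated = [72.0, 64.0, 57.0, 55.0]
error_rmse = rmse(estimated, observed)
error_mae = mae(estimated, observed)
```

RMSE is never smaller than MAE for the same data and penalises large individual errors more heavily, so the two measures can rank models differently.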

The estimated R values indicate that the Underwood model gives the better fit of the two deterministic models for both facilities. Considering the RMSE and MAE values, however, Model I fits better for sidewalk facilities and Model II fits better for carriageway facilities. For sidewalks, the RMSE is 5.06 m/min for Model I and 5.11 m/min for Model II; for carriageways, it is 5.29 m/min for Model I and 5.20 m/min for Model II. The LRM fit leads to the same ranking. It can therefore be concluded that, of the two deterministic models, Model I gives the better fit for sidewalk facilities and Model II for carriageway facilities.

Deterministic models cannot capture the complete variation of a real scenario, and pedestrian characteristics are in practice not well reproduced by the conventional, i.e. deterministic, approach. In the ANN approach, by contrast, a learning procedure updates the network architecture and connection weights so that the network performs efficiently. The ANN derives the underlying rules from the collected field data, on which the network architecture is trained.

Among the ANN models, the ANN 4 model performs best for sidewalk movement considering the overall R, MSE, RMSE and MAE values. The estimated RMSE for the ANN 4 model is 3.83 P/min/m considering speed–density, and the MAE is 4.73 m/min. For carriageway movement, the ANN 3 model gives the best fit to the observed data. As per the ANN 4 model for sidewalks, the estimated optimum density is 1.60 P/m², the optimum speed is 47.53 m/min and the capacity is 76.06 P/min/m. As per the ANN 3 model for carriageways, the estimated optimum density is 1.60 P/m², the optimum speed is 56.38 m/min and the capacity is 90.20 P/min/m. For sidewalk movement, the best fitted conventional model, i.e. the Greenshields model, gives an optimum speed of 39.51 m/min and a capacity of 104.69 P/min/m. For carriageway movement, the best fitted conventional model is the Underwood model, with an estimated optimum speed of 30.35 m/min and capacity of 84.38 P/min/m. The required space at capacity, 0.38 m²/P as per Model I for sidewalk movement and 0.36 m²/P as per Model II for carriageway movement, is very small and not possible in the real world, because pedestrian space includes body size, sway and the distance between two pedestrians. The space at capacity calculated from the ANN models is 0.63 m²/P for both facilities around the transport terminal. Pedestrians in a transport terminal area carry baggage, which requires additional space.

Scatter plots for the best fitted ANN models (ANN 4 for sidewalks and ANN 3 for carriageways) are shown in Figs. 10 and 11. The calibrated R values of the LRM for the best fitted models (Table 7) are 0.756 for sidewalk movement and 0.763 for carriageway movement, which indicates that the ANN models fit the observed data better.
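
The R value between observed and predicted series can be computed as a Pearson correlation coefficient, as sketched below. Treating R as the Pearson coefficient is an assumption about how the reported values were obtained, and the data are hypothetical.

```python
import math


def correlation(observed, predicted):
    """Pearson correlation coefficient R between two equal-length series."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted speeds in m/min.
observed = [70.0, 65.0, 60.0, 55.0]
predicted = [72.0, 64.0, 57.0, 55.0]
r = correlation(observed, predicted)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
```

An R close to 1 means the predicted values rise and fall with the observed ones, but it does not by itself show that they agree in magnitude, which is why RMSE and MAE are reported alongside it.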

Fig. 10 LRM between observed and predicted speed for sidewalks and carriageways (ANN model). a Observed and estimated speed for sidewalk. b Observed and estimated speed for carriageway

Fig. 11 LRM between observed and predicted flow for sidewalks and carriageways (ANN model). a Observed and estimated flow for sidewalk. b Observed and estimated flow for carriageway

Table 7 Comparison of observed R values of LRM for sidewalks and carriageways

7 Conclusions

The interrelationship between pedestrian flow parameters can be explained quantitatively using macroscopic flow diagrams. Models were developed using two approaches, deterministic and artificial neural network. Across the speed–density, flow–density and speed–flow models, the ANN approach gives a more suitable and realistic description of the relationships between pedestrian flow parameters. An ANN model is proposed that captures the relationships between input and output parameters by learning from a number of input patterns and their associated output patterns. In this study, a backpropagation algorithm is used for fast learning, with activation of hidden neurons. The accuracy of the models, in terms of performance and validation, is compared statistically for both approaches. The statistical analysis includes the correlation coefficient (R), the coefficient of determination (R²), RMSE, MAE and the relationship between observed and predicted values of the flow parameters. For sidewalk movement around the transport terminal, the maximum value of R is 0.73 for the deterministic approach (Greenshields model) and 0.76 for the ANN 4 model. For carriageway movement around the transport terminal, the R value is 0.77 for the Underwood model, the better of the two deterministic models, and 0.773 for the ANN 3 model. The best model, which describes pedestrian flow characteristics realistically, is selected based on the RMSE and MAE values. Considering these statistical measures, the ANN 4 model for sidewalk movement and the ANN 3 model for carriageway movement fit better than the other models and can analyse the relationships between flow parameters in a real scenario. Using LRM, it was also observed that the ANN 4 and ANN 3 models predict the data better than the deterministic models.
From these results, it can be stated that, on the statistical measures considered, ANN performs better than the conventional approach. The performance of an ANN model depends entirely on the data set, so sufficient data need to be collected to develop a good ANN model.