Introduction

Much effort and attention have been directed towards the investigation of vehicle-to-pedestrian (V2P) communication systems [1, 2]. Pedestrians refer to people walking on a street. V2P systems serve different purposes, such as safety or convenience for pedestrians [3]. To provide a clear view of pedestrian safety systems, this study poses six questions.

The first question is “Why are pedestrian safety and pedestrian behaviour important, and is pedestrian safety in V2P systems considered a concern at the current state?”.

V2P systems employ different communication technologies and various mechanisms to facilitate the interaction and information exchange between pedestrians and vehicles [4]. Statistics show that more than 3000 people die daily in road accidents [5]. Accidents are unexpected events, and the increasing number of road accidents has led to an increasing number of pedestrian fatalities [6,7,8,9,10,11,12,13]. The common causes of such accidents are dangerous driving, inattentiveness, misbehaviour, and error by pedestrians and vehicles. These factors negatively impact human safety. Pedestrian fatality has become the main safety concern all over the world, which explains the need to take pedestrian safety seriously. Pedestrian misbehaviour and inattention whilst walking have also been considered as causes of accidents [5, 14]. Walking is the most essential but the least protected mode of road transport. Many people use smartphones when walking, especially given the fast development of smartphone-based applications, such as social media applications, music and video players, games and book and magazine readers, all of which dominate smartphone usage. Such usage tends to distract pedestrians [15]. Distraction levels (e.g. ‘texting, watching a video or talking on the phone’) were elaborated in Ref. [16]. Moreover, smartphone usage has been identified as one of the reasons for pedestrians’ inattention, as pedestrians tend to stare at their smartphones whilst walking on the street; such smartphone users generally pose a greater danger than pedestrians who do not stare at their smartphones [17]. In addition, some pedestrian movements during specific actions/activities (e.g. ‘stopping, walking, waiting, running or crossing a curb’) are considered risky behaviours [18, 19]. Therefore, pedestrians who use smartphones whilst walking are referred to here as pedestrians with aggressive behaviour, a label based on either pedestrian behaviour distraction or movements during specific activities. Given this context, the second question is “What is the current research scenario for pedestrian behaviour to realise pedestrian safety and address research gaps?”

In recent years, relatively new studies have focused on pedestrian behaviour to realise pedestrian safety. The proposed model in Ref. [20] enjoins spatial formation to ensure effective navigation performance and connects the impacts of nearby environments to walking trails by using fuzzy logic. The work aimed to predict a pedestrian’s walking path by modelling a built environment for the pedestrian’s steering behaviour. A particular problem associated with walking path prediction, namely, ‘how a pedestrian chooses his/her next step position and speed when he/she is exposed to environmental stimuli during a normal and non-panic situation’, was solved in Ref. [21]. The work attempted to analyse pedestrians’ behaviour in terms of selecting crossing facilities when faced with signalised crosswalks and footbridges at intersections. Field video data were used to observe the pedestrians’ crossing behaviour. Despite the incessant measures taken in large cities to advance road safety, pedestrian run-overs (risky pedestrians) remain a major problem. A skilled fuzzy system, as presented in Ref. [22], records a decrease in the number of run-overs on pedestrian crossings with the growing use of reinforcement systems. The study analysed the travel speed of incoming vehicles and the space between vehicles and pedestrians. In Ref. [23], a method was used to model the diverse and subjective nature of environmental perception by analysing the movements of various pedestrians (i.e. their movement in indoor areas). The pedestrians’ behavioural features, such as signal violation behaviour, running behaviour, lane changing features, waiting time before crossing and risk perception, were reported to play meaningful but varying roles in ‘unsafe’ and ‘safe’ intersection clusters. In Ref. [24], a GIS-based methodology was proposed, and suitable indicators were developed and tested to collect and process the data required for the analysis of pedestrian crossing behaviour during urban trips. 
Specific patterns represented ‘the tendency to cross at the beginning of the trip and the tendency to cross at midblock locations when signalised junctions are not available’.

As highlighted in the previous discussion, no study has focused on pedestrian walking misbehaviour on the basis of pedestrian behaviour distraction or movements during specific activities. This inadequacy is considered a research gap. Pedestrian fatalities in the urban areas of developing countries, such as India, account for between 40% and 60% of total road traffic fatalities [2, 21]. In Russia, approximately 170,000 road traffic accidents were recorded according to the ‘Statistics of Road Traffic Accidents in 2017’ [25]. This number is three to four times that recorded in Europe. Annually, pedestrians contribute to 53,000 road traffic accidents, of which 20,000 cases are related to misbehaviour. One in three pedestrians is injured, and one in six deaths is caused by pedestrians being hit by a car on pedestrian crossings [25]. Thus, the third question is “What are the challenges and issues in pedestrian safety which are associated with pedestrian walking misbehaviour based on pedestrian behaviour distraction or movements during specific activities?”

Numerous studies [2, 20,21,22,23,24] have comprehensively explored pedestrian safety, but no work has yet analysed the real-time data of pedestrian walking misbehaviour on the basis of either pedestrian behaviour distraction or movements during specific activities to realise pedestrian safety for positive (normal) or aggressive pedestrians. Practically, pedestrian walking behaviour should be recognised, and aggressive pedestrians should be differentiated from normal pedestrians. This type of pedestrian behaviour recognition can be cast as a classification problem, which is the main challenge for pedestrian safety systems. In overcoming the challenge of classification, three issues should be considered.

First, the factors associated with pedestrian behaviour distractions or movements during specific activities that exert a strong effect on pedestrian safety should be identified (i.e. factor identification) by extracting the important features of walking pedestrians whilst recognising their positive and aggressive behaviours [15,16,17,18,19].

Secondly, real-time data on pedestrian walking behaviour should be collected (i.e. data collection) by sensor-based systems (e.g. accelerometers) or computer vision-based systems. The first system type is not always reliable because of the unfixed sensor positioning, whilst the second type is limited by poor visibility conditions (e.g. effects of night-time and bad weather conditions), the far locations of pedestrians (e.g. within tens of metres) or the non-line-of-sight positions with respect to sensors [26]. Accordingly, smartphone-based sensors have become popular as they can provide a balance between accuracy and usability of obtained data.

Thirdly, in terms of data exchange, recent V2P safety applications have started to use wireless communication instead of sensors for exchanging data [5, 27,28,29]. Wireless communication can provide 360-degree awareness via information exchange between two entities [30, 31]. Some classification approaches rely on client–server architectures to perform classification tasks, primarily for scalability and reliability. However, wireless communication has long suffered from two issues, namely, network congestion and server failure. Congestion is considered one of the most severe phenomena affecting the reliability of data transmission in networks, and it can cause either network failure or server failure [32]. Research on server failure has shown that some V2P applications dependent on client–server architectures encounter disruptions in their V2P communication networks and that the server side can cause link outage, which potentially leads to severe consequences [31]. Thus, pedestrian walking behaviour recognition is a pressing issue, especially in the event of network failure. The present study raises the following fourth question and provides an analytical response to address the above issues: “What are the criticisms and technical gaps in the academic literature on pedestrian safety systems based on pedestrian walking behaviour classification?”

Three aspects must be considered in proposing pedestrian safety systems on the basis of pedestrian walking behaviour classification: identification of factors, collection of data, and exchange of data in the contexts of wireless communication and network failure [5, 27,28,29,30,31,32,33]. Accordingly, three studies about pedestrian safety systems based on pedestrian walking behaviour classification are presented [26, 34, 35].

In the context of factor identification, the articles [26, 34, 35] that dealt with the issue of pedestrian behaviour classification did not consider the importance of identifying the reasons behind pedestrian distractions, such as mobile phone usage or aggressive activities (e.g. running in the street). These factors are important and considered as the main criteria on which the data collection process depends.

With regard to data collection, various activity recognition approaches have been proposed in several studies. Such studies can be divided into two categories on the basis of the methods of data collection, namely image-based and sensor-based methods. The pattern classification of images is the most traditional method used for pedestrian data collection. An existing dataset comprises 4000 pedestrian patterns and 5000 nonpedestrian patterns, each measuring 18 × 36 pixels and cropped from video sequences [35]. The study can be largely classified as image processing research. However, using images to identify different activities has a major drawback: it violates the privacy of users. Many applications share this problem. The work in Ref. [34] used sensory data for activity recognition from a publicly available dataset, which is not reliable. In Ref. [26], gyroscope and accelerometer sensors were used to collect data despite previous studies reporting the many disadvantages of using multiple sensors in identifying human activities, including the need to preprocess a large amount of sensor data (signal segmentation, feature selection, extraction and feature reduction); this requirement is time-consuming and thus influences classification accuracy, leading to an imbalance between accuracy and usability of obtained data. Gyroscopes are the most widely used sensors for activity identification, and they are generally incorporated in wearable devices (e.g. smartphones). However, no study has reported that using gyroscopes alone provides a balance between accuracy and usability of obtained data.

With regard to data exchange in wireless communication and network failure contexts, previous works on pedestrian classification indicated that machine learning (ML) requires a ‘very large number of data, especially when the dimension of the data increases significantly, and the data required for accurate analysis increases dramatically’. As the amount of data increases, the computational cost also increases exponentially. This phenomenon is the reason behind the need to use feature selection and feature extraction. The works in Refs. [26, 34, 35] did not cover all these requirements. In Ref. [34], ML and principal component analysis were used to recognise daily human activities. In Ref. [35], a simple threshold technique involving the pattern classification of images was utilised; this technique can be adopted using first- and second-order statistics to determine whether a test pattern belongs to the pedestrian class or nonpedestrian class. A linear support vector machine (SVM) then validates the hypotheses that are most likely to contain a person. The author in Ref. [26] proposed an approach that works efficiently with limited hardware resources and provides satisfactory activity identification by using smartwatch sensor data based on hybrid feature selection models. Most existing studies developed modules for pedestrian behaviour classification by using ML techniques when a server is available (wireless communication); meanwhile, no study has developed any pedestrian behaviour classification module for cases wherein no server is available (network failure).

According to the above discussion, the prominent issues related to pedestrian walking behaviour classification should be addressed. Therefore, a new pedestrian walking behaviour classification based on pedestrian behaviour distraction or movements during specific activities should be established and motivated to realise pedestrian safety. Thus, this study asks the fifth question: “What is the recommended solution?”

A novel approach to pedestrian walking behaviour classification in wireless communication and network failure contexts is proposed herein. To address the first issue (factor identification), this work identifies several factors that drive irregular walking behaviour of mobile phone users by constructing a questionnaire that can determine users’ options (attitudes/opinions) about mobile usage whilst on the street. For the second issue (data collection), four different testing scenarios are developed to acquire the real-time data of pedestrian walking behaviour by using gyroscope sensors. For the third issue (data exchange), the proposed intelligent approach relies on one of two modules. The first module is developed for pedestrian behaviour classification using ML techniques via wireless communication in cases in which servers are available. The second module is developed on the basis of four standard vectors to classify pedestrian walking behaviour in cases wherein no server is available; such fault-tolerant pedestrian walking behaviour classification is initiated when failures occur in a network. The sixth and last question is “What are the novelty and contributions of the present study?”

The presented study can support pedestrian safety systems through its novel approach to pedestrian walking behaviour classification in the contexts of wireless communication and network failure. The contributions of this work can be summarised in the following points:

  1. This study fills the gap in the identification of several factors driving the irregular walking behaviour of mobile phone users by constructing a questionnaire that can determine users’ options (attitudes/opinions) about mobile usage whilst on the street.

  2. This study develops four different scenarios for collecting real-time pedestrian walking data to extract the important features of pedestrian walking behaviour.

  3. This study develops a module for pedestrian behaviour classification using ML techniques for cases in which servers are available.

  4. This study develops a module on the basis of four standard vectors to classify pedestrian walking behaviour when servers are unavailable.

Methodology

The proposed methodology is composed of three major phases (Fig. 1). First, the factor identification phase includes the design and distribution of the questionnaire about mobile phone usage, analysis of questionnaire data and identification of aggressive behaviour. Second, the data collection phase involves the identification of all requirements and scenarios for pedestrian data gathering. Third, the data exchange phase includes preprocessing, pedestrian behaviour classification (two types of classification: using either ML with a server or performing statistical calculations when no server is available) and the validation and evaluation of results. Figure 2 presents the sequence of the proposed approach.

Fig. 1 Flowchart of the study

Fig. 2 Sequence of proposed approach

Phase I: factor identification

The literature review shows that one of the scenarios that distract pedestrians whilst on the street is the use of mobile phones. In this scenario, pedestrians can be classified as aggressive, and they represent a type of risk on the street. Therefore, we design and distribute a questionnaire about mobile phone usage to determine the number of mobile phone users and their purposes for using their phones whilst walking on the street. An electronic questionnaire (Google Form) is adopted, in which the items are based on mobile phone users’ attitudes/opinions; questionnaires are an active and low-cost research tool for gathering data from respondents [36]. The questionnaire consists of 12 questions about mobile usage whilst walking. The survey questions are formulated with the assistance of experts. The questionnaire involves two parts. The first part comprises five questions covering personal information, which the respondents may opt to skip. The second part consists of questions related to the several uses of mobile phones, such as talking, sending messages and using GPS. The answers to these questions are analysed to identify the factors driving the irregular walking behaviour of pedestrians.

Phase II: data collection

This phase comprises two stages. The first stage ("Identified data collection requirements") sets out the requirements for collecting data on pedestrian walking behaviour, whilst the second stage ("Identified scenarios for collecting data") concerns the proposed scenarios for collecting the data.

Identified data collection requirements

UPM University in Malaysia is selected as the site for the experiment as this university has a large number of students with mixed genders and ages. The work team consists of four persons, and they have been trained to learn exactly how the experiment is implemented and how the study can be explained to the participants in terms of collecting their walking data. As for the tools used in the experiment to gather data, a gyroscope sensor is utilised to record the signals of pedestrian walking behaviour. Each of four Samsung mobile phones is used to gather data for a particular scenario. A gyroscope is ‘a device used with respect to the Earth’s gravity to help determine orientation. Its design consists of a freely rotating disk called a rotor, mounted onto a spinning axis in the centre of a larger and more stable wheel. As the axis turns, the rotor remains stationary to indicate the central gravitational pull, and thus, which way is down’. Additionally, a gyroscope maintains its level of effectiveness by measuring the rate of rotation around a particular axis. By using the key principles of angular momentum, gyroscopes can help indicate orientation.

Identified scenarios for collecting data

Four different scenarios are identified for the collection of the required data. All the datasets are related to pedestrians carrying mobile phones whilst walking on the street. The aim of using different datasets is to identify pedestrians’ walking behaviours in real time and thereby classify them as either aggressive or normal pedestrians. The four datasets used in this study are listed below.

  • First scenario: normal walking. This scenario is related to the normal walking of pedestrians, in which mobile phones are kept in users’ pockets.

  • Second scenario: calling. This scenario involves pedestrians talking on the phone.

  • Third scenario: chatting. This scenario involves pedestrians conversing with other people.

  • Fourth scenario: running. This scenario involves pedestrians running on the street (this movement on the road is the most dangerous type).

Some cases of mobile phone usage whilst walking, such as listening to music, are excluded because they do not pose a danger and are not distracting for pedestrians. The first dataset is about normal walking whilst the others are about aggressive walking, which can be described by calling, chatting or running. The four scenarios are carried out by a number of participants who walk on the street for a few seconds whilst carrying their mobile phones in different situations. The data in the first test are recorded as the participants walk normally whilst carrying a mobile phone in their pockets. In the second dataset, the gyroscope is used to record the data of walking participants as they use their mobile phones for chatting or sending messages. In the third dataset, the data of the participants talking on their mobile phones whilst walking are recorded. The same procedure is applied to the last dataset, in which the participants are instructed to carry their mobile phones and run on the street. Each of the four scenarios is implemented on a specific number of persons (males and females of different ages).

Phase III: data exchange

This section presents the preprocessing, which involves data analysis and feature extraction ("Preprocessing"), and the classification using ML and Euclidean techniques ("Classification using machine learning techniques" and "Classification using Euclidean technique", respectively).

Preprocessing

Data processing is one of the most important steps in classification [34]. Its process consists of signal segmentation, feature selection, extraction and feature reduction. From a high-level perspective, statistics is a mathematical approach to achieving a technical analysis of a set of information. Statistics is ‘a robust instrument that can operate on data and produce meaningful information’. In other words, data analysis from a high-level viewpoint is made possible by producing technical data. Furthermore, statistics can produce visualisation graphs and in-depth statistical data, both of which are more information-driven and targeted towards reaching concrete conclusions, particularly for our data, in comparison with a simple projection. Subsequently, statistics can create relatively deep and fine-grained insights to depict exactly the approach to data structuring. The optimal and simultaneous utilisation of different data science techniques can help produce meaningful and in-depth information. The data collection process is completed using gyroscope sensors, which can record the walking signal data as time (t) and x, y and z. Then, the collected data are analysed to extract the required features in the time domain for the walking signals of the pedestrians by using the four steps shown in Fig. 3. The four scenarios are applied to all samples. The moving average method is used in this analysis to smoothen the pedestrian walking behaviour signals; here, each set of two or three adjacent cells over some period is averaged. The moving average is a technique used often in technical analysis.

  • First step: The segmentation technique is used to divide the sensor signals into small time-window segments so that the features can be easily extracted from each segment. The average of each two adjacent cells relative to t and x, y and z is determined by a sliding window (2-s time window). The same step is repeated for each set of three cells, as represented by a 3-s time window. Thus, the outcomes are two- and three-window averages, as shown in Fig. 3.

  • Second step: Each two adjacent cells are subtracted from one another for each of the original data (2- and 3-window averages). In this step, the value and direction of the pedestrian movement axis measured by the gyroscope sensor are identified.

  • Third step: The positive and negative values are obtained after the subtraction step. In this step, these values should be separated from one another to obtain the positive and negative values for each of t and x, y and z, which refer to the direction of axis movement.

  • Fourth step: In the feature selection process, several signal characteristics are extracted from the raw sensory data. Time-domain features are widely used in feature calculation. Five traditional statistical calculations (min, max, mean, median, and standard deviation) are calculated for each sample from the four scenarios to extract their features.

Fig. 3 Pre-processing steps
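The first three preprocessing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`window_average`, `adjacent_differences`, `split_by_sign`) are hypothetical, the window is expressed as a number of adjacent samples (2 or 3) that slides one sample at a time, zero differences are dropped, and the sample readings are invented.

```python
def window_average(values, size):
    """Average each run of `size` adjacent samples (sliding window, step of 1)."""
    return [sum(values[i:i + size]) / size
            for i in range(len(values) - size + 1)]


def adjacent_differences(values):
    """Subtract each sample from the one that follows it."""
    return [b - a for a, b in zip(values, values[1:])]


def split_by_sign(values):
    """Separate positive and negative differences (zeros dropped: an assumption)."""
    positives = [v for v in values if v > 0]
    negatives = [v for v in values if v < 0]
    return positives, negatives


# Invented gyroscope readings for one axis (x); not data from the study.
x_axis = [1.0, 3.0, 2.0, 6.0, 4.0]

avg2 = window_average(x_axis, 2)        # two-window average
avg3 = window_average(x_axis, 3)        # three-window average

# Step 2 is applied to the original data and to both averaged series.
diffs = {name: adjacent_differences(series)
         for name, series in {"raw": x_axis, "avg2": avg2, "avg3": avg3}.items()}

# Step 3: direction of movement along the axis.
pos_raw, neg_raw = split_by_sign(diffs["raw"])
```

The same three functions would be applied to each of the y and z series in turn.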

A feature is ‘a statistical function that works brilliantly to extract meaningful information of data in a natural way’. From the outlook of pedestrian behaviour recognition, a specific pattern is created from a specific physical movement of users. For example, the behaviour of a ‘running’ pedestrian has a specific pattern as the action involves greater physical effort from a pedestrian. It therefore differs from the behaviour of a ‘walking’ pedestrian. Some inertial sensors, such as gyroscope sensors, can measure the intensity of each physical effort and produce different pattern distributions. Hence, the median, standard deviation or any other statistical feature is determined to underline the difference between the abovementioned two behaviours. In the feature extraction step ‘d’, a statistical feature from the three sets of axial data (x, y and z) of the gyroscope is extracted. To obtain the maximum information, we extract five base features from the collected data: MIN, MAX, MEAN, MEDIAN and STANDARD DEVIATION.
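As a hedged illustration of this feature extraction step, the sketch below computes the five base features for each gyroscope axis using Python's standard `statistics` module. The function name `time_domain_features`, the feature naming scheme and the readings are assumptions made for the example; the text does not state whether the sample or population standard deviation is used, so the sample form is assumed here.

```python
import statistics


def time_domain_features(samples):
    """Return the five statistics named in the text for one axis of one sample."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "std": statistics.stdev(samples),  # sample standard deviation (assumed)
    }


# Invented readings for the three gyroscope axes; not data from the study.
recording = {
    "x": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0],
    "y": [1.0, 1.0, 2.0, 3.0],
    "z": [0.0, -1.0, 1.0],
}

# One flat feature vector per recording: 3 axes x 5 statistics = 15 values.
feature_vector = {
    f"{axis}_{name}": value
    for axis, samples in recording.items()
    for name, value in time_domain_features(samples).items()
}
```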

Classification using machine learning techniques

In this case study, the classification using ML techniques is implemented in the server when the connection with the server is available. ML methods can identify the class of a new sample by learning from classified examples. A classification algorithm is provided with a training sample and its corresponding outcome; that is, each outcome represents a class of that sample, and each sample contains multiple attributes that carry information. Random forest and decision tree algorithms are amongst the most popular classifiers. Random forest is appropriate for high-dimensional data modelling because it can handle missing values and continuous, categorical and binary data [37]. Bootstrapping and ensemble schemes make random forest sufficiently strong to overcome the problems of overfitting, and pruning trees is thus unnecessary. Moreover, random forest offers high prediction accuracy and is efficient, interpretable and non-parametric for various types of datasets [38]. The model interpretability and prediction accuracy provided by random forest are remarkably unique amongst popular ML methods. Accurate predictions and better generalisations are achieved with the use of ensemble strategies and random sampling [39]. Conversely, a decision tree is built from pre-classified data. The division into classes is decided by the features that best divide the data. The data items are split according to the values of these features. This process is applied to each split subset of the data items recursively. The process terminates when all the data items in the current subset belong to the same class [38, 40].
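The recursive splitting described for the decision tree can be sketched as follows. This is a toy illustration, not the WEKA or RapidMiner implementation used in the study: Gini impurity is assumed as the splitting criterion (the text does not name one), the function names are hypothetical, and the feature values are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (score, feature, threshold) with the lowest weighted impurity."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def build_tree(rows, labels):
    """Split recursively until every subset holds a single class."""
    if len(set(labels)) == 1:
        return labels[0]                          # pure subset: stop here
    split = best_split(rows, labels)
    if split is None:                             # identical rows, mixed labels
        return Counter(labels).most_common(1)[0][0]
    _, feature, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(*zip(*left)),
        "right": build_tree(*zip(*right)),
    }


def predict(tree, row):
    """Follow the splitting rules from the root down to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Invented two-feature samples (e.g. x_std, y_std); not data from the study.
rows = [[0.2, 0.1], [0.3, 0.2], [1.5, 1.1], [1.7, 1.4]]
labels = ["normal", "normal", "running", "running"]
tree = build_tree(rows, labels)
```

A random forest would train many such trees on bootstrap samples with random feature subsets and combine their votes; that part is omitted here for brevity.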

Prepared data for classification in server

The proposed model in this research is trained using a supervised learning procedure. Thus, the category sets need to be labelled before training. These samples are labelled with a specific letter. A set of labelled data is used to learn and classify the features into their classes. The proposed approach is trained and evaluated using five category sets (Table 1) to obtain the best classification of pedestrian walking behaviour. The first category set consists of four classes, namely, normal walking, chatting, calling and running. The second one comprises two classes, namely normal walking and aggressive; the classes chatting, calling and running are labelled as aggressive. In the third category set, the ‘calling’ class is ignored, and only three classes are considered. The fourth category set entails normal and aggressive classes without ‘calling’. The fifth category set involves only the normal walking and running classes.
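The relabelling behind the five category sets can be sketched as a small mapping function. The sketch follows the prose description above rather than Table 1 itself; the function name `relabel` and the use of full class names instead of single-letter labels are assumptions made for readability.

```python
AGGRESSIVE = {"chatting", "calling", "running"}


def relabel(label, category_set):
    """Map one of the four scenario labels to its class in category sets 1-5.

    Returns None when the sample is excluded from that category set.
    """
    if category_set == 1:                   # four classes
        return label
    if category_set == 2:                   # normal vs aggressive
        return "aggressive" if label in AGGRESSIVE else "normal walking"
    if category_set == 3:                   # three classes, 'calling' ignored
        return None if label == "calling" else label
    if category_set == 4:                   # normal vs aggressive, 'calling' ignored
        if label == "calling":
            return None
        return "aggressive" if label in AGGRESSIVE else "normal walking"
    if category_set == 5:                   # normal walking vs running only
        return label if label in {"normal walking", "running"} else None
    raise ValueError(f"unknown category set: {category_set}")


scenarios = ["normal walking", "calling", "chatting", "running"]
set_four = [relabel(s, 4) for s in scenarios]
```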

Table 1 Five category sets used for training and validation

Training and validation of classification in server

In terms of pedestrian behaviour classification in the server, three experiments are conducted. In the first experiment, the five category sets are used to train a well-known ML classifier, namely random forest. In the second experiment, attribute selection methods are used to identify the valuable features with the random forest classifier. In the third experiment, optimisation techniques based on the decision tree algorithm are used to improve the classification accuracy. Experiments 1 and 2 are implemented in WEKA 3.8.3 software running on a Windows-based computer system with 2.50 GHz dual Intel(R) Core (TM) i7 and 8 GB RAM. The test option is set to tenfold cross-validation. The random forest classifier is fine-tuned using two parameters, namely sample split and number of trees. Then, the best result of the tuning is considered for training from fold 1 to fold 10 on the five selected category sets (Table 1). Experiment 3 is implemented in RapidMiner software 9.0.
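To make the tenfold cross-validation option concrete, the following sketch generates the train/test index splits in plain Python. It is an illustration only: the helper name `k_fold_indices` is hypothetical, the split is a simple shuffled, non-stratified one, and it is not a reproduction of how WEKA partitions the data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]      # k nearly equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 25 is an arbitrary sample count chosen for the example.
splits = list(k_fold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so every sample is used for validation once and for training nine times.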

  A. Experiment I: classification using random forest technique with whole features

In this experiment, the five category sets described in "Prepared data for classification in server" are tested using the random forest classifier with all features. In random forest, the features are randomly selected at each decision split. Randomly selecting the features reduces the correlation between trees, which improves the prediction power and results in higher efficiency. A confusion matrix in the ML field has a specific table layout that makes it easy to see whether the system confuses two classes. In this work, each row of the matrix represents the instances of a predicted walking behavioural class, whereas each column represents the instances of an actual class (or vice versa). The diagonal of the matrix denotes the correctly classified instances. Practically, Rows 1–4 of the confusion matrix are labelled normal walking state, running state, calling state and chatting state, respectively. The confusion matrix thus indicates the number of instances that are correctly classified in each class.
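A confusion matrix of this form can be built in a few lines. The sketch below is illustrative: rows are taken as predicted classes and columns as actual classes, the row order follows the text, and the actual and predicted labels are invented rather than results from the study.

```python
CLASSES = ["normal walking", "running", "calling", "chatting"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are predicted classes, columns are actual classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


def accuracy(matrix):
    """Share of instances on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Invented labels for six samples; not results from the study.
actual = ["normal walking", "normal walking", "running",
          "calling", "chatting", "chatting"]
predicted = ["normal walking", "normal walking", "running",
             "chatting", "chatting", "calling"]

matrix = confusion_matrix(actual, predicted)
```

In this toy case the off-diagonal counts sit only in the calling and chatting cells, which is how the layout exposes a pair of classes that the classifier confuses.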

  B. Experiment II: attribute selection based on random forest technique

The feature selection process can be applied to reduce the number of features in the raw data [26]. This process is often expected to satisfy numerous requirements, such as short training time, high accuracy and real-time data generalisation. The process can also reduce confusion and misclassification. Then, all scenarios are tested again according to the feature selection method in ML with random forest classifier. Feature subset selection becomes quite important and predominant in the case of datasets containing a higher number of variables. Random forest has emerged as a relatively efficient and robust algorithm that can handle feature selection problem even with a higher number of variables. It is also considerably efficient in handling missing data imputation, classification and regression problems. In this study, we apply the concept of random forest algorithm on feature selection and classification [39]. The attribute evaluator is selected as an interface for the classes in view of evaluating the attributes individually. In this method, the evaluator identifies and deletes similar features to eliminate confusion and maintain the features that may affect the classification. Table 2 shows all the attribute evaluator types and the corresponding search methods.
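The idea of deleting similar features can be illustrated with a simple correlation filter. This is a stand-in chosen for the example and not one of the WEKA attribute evaluators listed in Table 2: the helper names, the Pearson correlation criterion, the 0.95 threshold and the feature values are all assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0          # constant column: treated as uncorrelated (assumed)
    return cov / math.sqrt(var_a * var_b)


def drop_similar_features(columns, threshold=0.95):
    """Keep a feature only if it is not a near-duplicate of one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Invented feature columns (one value per sample); not data from the study.
columns = {
    "x_mean":   [1.0, 2.0, 3.0, 4.0],
    "x_median": [1.1, 2.0, 3.2, 3.9],   # nearly the same information as x_mean
    "z_std":    [4.0, 1.0, 3.0, 2.0],
}
selected = drop_similar_features(columns)
```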

  1. C.

    Experiment III: decision tree techniques based on grid optimisation

Table 2 Types of attribute evaluators and the search methods

Hyperparameters play an important role in ML techniques because they closely influence the behaviour of the training techniques and they exert a direct impact on model performance. Therefore, hyperparameter optimisation is a critical task. The best performance in the classification test can still be determined. The design shown in Fig. 4 is adopted to enhance the classification. The optimise operator finds the optimal values of the selected parameters for the operators in the subprocess. The optimise parameter (grid) operator is a nested operator. It executes the subprocess for all combinations of the selected values of the parameters and then delivers the optimal parameter values through the parameter set port. The performance vector for the optimal values of parameters is delivered through the performance port whilst the associated model (if any) is delivered through the model port. Additional results of the best run are delivered through the output ports. The identification of the optimal parameters is based on the performance value delivered to the inner performance port. The inner performance port can be used to log the performance of the inner subprocess. A log is created automatically to capture the number of runs, the parametric settings and the main criterion or all criteria of the delivered performance vector depending on the parameter log of all the criteria. This setup can be disabled by deselecting log performance. The inner performance port is also used to determine the best model upon comparing the fitness of the performances of the different iterations. The decision tree technique is used with this optimiser to enhance the classification accuracy. A decision tree is a tree-like collection of nodes intended to create a decision on value affiliation to a class or an estimate of a numerical target value. Each node represents a splitting rule for one specific attribute [38]. For classification, this rule separates values belonging to different classes. 
The building of new nodes is repeated until the stopping criteria are met. A prediction for the class label attribute is determined depending on the majority of the dataset which reaches this leaf during generation whilst an estimation for a numerical value is obtained by averaging the values in a leaf. The dataset upon which the model is applied should be compatible with the attributes of the model. That is, the dataset should have the same number, order, type and role of attributes as the dataset used to generate the model.

Fig. 4
figure 4

Optimiser design

The split data operator utilises a dataset as its input and delivers the subsets of that dataset through its output ports. The number of subsets (or partitions) and the relative size of each partition are specified through the partition parameter. The ratios of all partitions should sum to 1. The sampling type parameter decides how the dataset should be shuffled in the resultant partitions. This operator differs from other sampling and filtering operators in the sense that it can deliver multiple partitions of a given dataset. A model is first trained on a dataset by another operator, which is often a learning technique. Subsequently, the model can be applied to another dataset. Generally, the goal is to derive a prediction on unseen data or transform the data by applying a preprocessing model. The dataset to which the model is applied should be compatible with the attributes of the model; that is, the dataset should have the same number, order, type and role of attributes as the dataset used to generate the model.
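The partitioning behaviour described above can be illustrated with a small stand-alone sketch. It is not the operator used in this study; the function name `split_data`, the seed and the rounding rule are assumptions made for the example.

```python
import random

def split_data(rows, ratios, shuffle=True, seed=0):
    """Split rows into len(ratios) partitions whose relative sizes follow ratios."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("partition ratios must sum to 1")
    rows = list(rows)
    if shuffle:
        # Shuffled sampling; a fixed seed keeps the split reproducible.
        random.Random(seed).shuffle(rows)
    partitions, start = [], 0
    for k, ratio in enumerate(ratios):
        # The last partition takes the remainder so no row is lost to rounding.
        end = len(rows) if k == len(ratios) - 1 else start + round(ratio * len(rows))
        partitions.append(rows[start:end])
        start = end
    return partitions

# Two equal partitions: one for training and one for testing.
train, test = split_data(range(10), [0.5, 0.5])
print(len(train), len(test))
```

A model trained on `train` would then be applied to `test`, which must share the same attributes.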

Classification is a technique used to predict group membership for data instances. In evaluating the statistical performance of a classification model, the dataset should be labelled, i.e. it should have an attribute with a label role and an attribute with a prediction role. The label attribute stores the actual observed values, whereas the prediction attribute stores the values of the label predicted by the classification model under discussion. The performance (classification) operator is used with classification tasks only; by contrast, the general performance operator automatically determines the learning task type and calculates the most common criteria for that type.

Classification using Euclidean technique

In this case study, the classification using the Euclidean technique is implemented in the pedestrian smartphone when the connection with the server is not available.

Prepared data for classification without server

A fixed criterion is established to build four standard vectors, one for each of the datasets mentioned above. After completing the analytical steps and extracting all possible features, the AVERAGE is calculated for all features in each scenario. The five AVERAGES are arranged vertically to create the standard vectors, as shown in the succeeding equations. In Eq. (14), vector-1 represents the normal walking scenario. In Eq. (15), vector-2 represents the calling scenario. Vector-3 and vector-4 represent the chatting and running scenarios in Eqs. (16) and (17), respectively.

X, Y and Z are variables, where \(x\in \left\{{x}_{1},{x}_{2},\dots ,{x}_{i},\dots ,{x}_{n}\right\}\).

\({\overline{x} }_{2\omega }\) and \({\overline{x} }_{3\omega }\) represent the average values for the two- and three-window moving average, respectively [41].

$${\overline{x} }_{2\omega }=\frac{{x}_{i}+{x}_{i+1}}{2}$$
(1)
$${\overline{x} }_{3\omega }=\frac{1}{3}\sum_{k=i}^{i+2}{x}_{k}$$
(2)

\({S}_{x}, {S}_{{x}_{2\omega }}, {S}_{{x}_{3\omega }}\) represent the subtraction values of the original data, two-window moving average and 3-window moving average, respectively.

$${S}_{x}={x}_{i+1}-{x}_{i}$$
(3)
$${S}_{{x}_{2\omega }}={\overline{x} }_{2\omega ,i+1}-{\overline{x} }_{2\omega ,i}$$
(4)
$${S}_{{x}_{3\omega }}={\overline{x} }_{3\omega ,i+1}-{\overline{x} }_{3\omega ,i}$$
(5)
$$F\left(S\right)=\left\{\begin{array}{ll}{X}_{i}& \text{if } S\ge 0\\ {X}_{j}& \text{if } S<0,\end{array}\right.$$
(6)

where \({X}_{i}\) represents the positive value for x and \({X}_{j}\) represents the negative value for x [101].

$$ Q_{{xi}} = \left\{ \min_{i \in I} (X_{i}),\; \max_{i \in I} (X_{i}),\; {\text{mean}}_{i \in I} (X_{i}),\; {\text{median}}_{i \in I} (X_{i}),\; {\text{stdv}}_{i \in I} (X_{i}) \right\}, $$
(7)

where I, J = 200 represents the recorded data of the X-coordinate.

$$ Q_{{xj}} = \left\{ \min_{j \in J} (X_{j}),\; \max_{j \in J} (X_{j}),\; {\text{mean}}_{j \in J} (X_{j}),\; {\text{median}}_{j \in J} (X_{j}),\; {\text{stdv}}_{j \in J} (X_{j}) \right\} $$
(8)

The same procedure is applied to the Y and Z variables.

$$ Q_{{Yi}} = \left\{ \min_{i \in I} (Y_{i}),\; \max_{i \in I} (Y_{i}),\; {\text{mean}}_{i \in I} (Y_{i}),\; {\text{median}}_{i \in I} (Y_{i}),\; {\text{stdv}}_{i \in I} (Y_{i}) \right\}, $$
(9)

where I, J = 200 represents the recorded data of the Y-coordinate.

$$ Q_{{Yj}} = \left\{ \min_{j \in J} (Y_{j}),\; \max_{j \in J} (Y_{j}),\; {\text{mean}}_{j \in J} (Y_{j}),\; {\text{median}}_{j \in J} (Y_{j}),\; {\text{stdv}}_{j \in J} (Y_{j}) \right\} $$
(10)
$$ Q_{{Zi}} = \left\{ \min_{i \in I} (Z_{i}),\; \max_{i \in I} (Z_{i}),\; {\text{mean}}_{i \in I} (Z_{i}),\; {\text{median}}_{i \in I} (Z_{i}),\; {\text{stdv}}_{i \in I} (Z_{i}) \right\}, $$
(11)

where I, J = 200 represents the recorded data of the Z-coordinate.

$$ Q_{{Zj}} = \left\{ \min_{j \in J} (Z_{j}),\; \max_{j \in J} (Z_{j}),\; {\text{mean}}_{j \in J} (Z_{j}),\; {\text{median}}_{j \in J} (Z_{j}),\; {\text{stdv}}_{j \in J} (Z_{j}) \right\}. $$
(12)

Suppose that R represents the row of the five extracted features (min, max, mean, median, standard deviation) from the variables X, Y and Z.

$$ R = \left[ Q_{xi},\; Q_{xj},\; Q_{Yi},\; Q_{Yj},\; Q_{Zi},\; Q_{Zj} \right] $$
(13)
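The feature extraction in Eqs. (1)–(13) can be sketched as follows. This is a sketch under stated assumptions rather than the implementation used in the study: the five statistics are assumed to be taken over the positive and negative parts of each difference series, the population standard deviation is assumed, empty parts are assumed to yield zeros, and all function names are hypothetical.

```python
import statistics

def moving_average(x, window):
    """Two- or three-window moving average, as in Eqs. (1) and (2)."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def differences(x):
    """Subtraction of consecutive values, as in Eqs. (3)-(5)."""
    return [b - a for a, b in zip(x, x[1:])]

def five_stats(values):
    """min, max, mean, median and standard deviation of one part."""
    if not values:  # assumption: an empty part contributes zeros
        return [0.0] * 5
    return [min(values), max(values), statistics.mean(values),
            statistics.median(values), statistics.pstdev(values)]

def axis_features(x):
    """30 features for one axis: 3 series x 2 signs x 5 statistics."""
    features = []
    for series in (x, moving_average(x, 2), moving_average(x, 3)):
        s = differences(series)
        positive = [v for v in s if v >= 0]  # Eq. (6), S >= 0
        negative = [v for v in s if v < 0]   # Eq. (6), S < 0
        features += five_stats(positive) + five_stats(negative)
    return features

def sample_features(x, y, z):
    """Concatenate the three axes into one row of 90 features."""
    return axis_features(x) + axis_features(y) + axis_features(z)

# Toy gyroscope readings, invented for illustration.
row = sample_features([0, 1, 3, 2, 5], [1, 1, 1, 1, 1], [5, 4, 3, 2, 1])
print(len(row))
```

Under these assumptions, each sample yields 3 axes × 3 series × 2 signs × 5 statistics = 90 features.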

Suppose that Vn, VC, VCH, VR represent the normal walking, calling, chatting and running vectors, respectively [13].

$$ V_{n} = \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{min}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{max}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{mean}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{meadian}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{stdv}}R_{j} } $$
(14)
$$ V_{C} = \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{min}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{max}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{mean}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{meadian}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{stdv}}R_{j} } $$
(15)
$$ V_{{CH}} = \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{min}}H_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{max}}H_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{mean}}H_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{meadian}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{stdv}}R_{j} } $$
(16)
$$ V_{R} = \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{min}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{max}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{mean}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{meadian}}R_{j} } \frac{1}{N}\sum\limits_{{j = 1}}^{{j = N}} {{\text{stdv}}R_{j} } $$
(17)

In Eq. (14), the average of all the minimum values of all the samples in the normal walking scenario extracted previously is obtained. The same step is performed for the max, median, mean and standard deviation values, and the same method is applied to the remaining three scenarios. The obtained vectors are tested, and their performance is evaluated to recognise aggressive pedestrian behaviour. Ten new samples for each of the four scenarios (normal walking, chatting, calling and running) are collected and analysed following the same procedure of analysis and feature extraction in "Preprocessing". The obtained features can be reduced by comparing all of the features for each scenario and determining which feature has a better effect than the rest of the features.
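The averaging step that produces a standard vector can be illustrated with a short sketch. The function name `standard_vector` and the toy feature rows (three features instead of the full set) are hypothetical and serve only to show the computation.

```python
def standard_vector(samples):
    """Average each feature position over all N samples of one scenario."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

# Toy feature rows for one scenario, invented for illustration.
normal_samples = [
    [-1.0, 2.0, 0.5],
    [-3.0, 4.0, 1.5],
]
v_normal = standard_vector(normal_samples)
print(v_normal)  # [-2.0, 3.0, 1.0]
```

The same function would be called once per scenario to obtain the four vectors.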

Performance evaluation for the classification without a server

Each of the new samples is compared with the four vectors by using the Euclidean technique. Equation (18) gives the square root of the sum of squares of the differences between each value in the sample and the corresponding value in the vector; the result is then divided by the number of extracted features. This formula calculates the distance between the sample and each vector. The lowest distance identifies the closest scenario and hence the category to which the sample belongs.

$$d\left({x}_{d},{x}_{v}\right)=\frac{\sqrt{\sum_{n=1}^{N}{\left({x}_{d,n}-{x}_{v,n}\right)}^{2}}}{N},$$
(18)

where \(d\left({x}_{d},{x}_{v}\right)\) = distance between the sample and the vector, N = number of features (dimensions), \({x}_{d,n}\) = the nth feature value of the sample data and \({x}_{v,n}\) = the nth feature value of the vector data.
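A minimal sketch of this nearest-vector classification is given below. The function names and the two-feature vectors are hypothetical toy values, not the vectors obtained in this study; only the distance formula follows the definition above (Euclidean distance divided by the number of features).

```python
import math

def distance(sample, vector):
    """Euclidean distance between sample and vector, divided by the feature count N."""
    n = len(sample)
    return math.sqrt(sum((d - v) ** 2 for d, v in zip(sample, vector))) / n

def classify(sample, vectors):
    """Return the scenario whose standard vector is closest to the sample."""
    return min(vectors, key=lambda name: distance(sample, vectors[name]))

# Toy standard vectors, invented for illustration only.
vectors = {
    "normal walking": [1.0, 1.0],
    "calling": [2.0, 2.0],
    "chatting": [3.0, 3.0],
    "running": [5.0, 5.0],
}
label = classify([2.9, 3.2], vectors)
print(label)  # chatting
```

Because only four distances are computed per sample, this comparison is cheap enough to run on the smartphone itself.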

Equations (19) and (20), respectively, present accuracy and precision, which are applied to the result of each test to quantify the samples classified correctly and the samples classified incorrectly.

$$\mathrm{accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(19)
$$\mathrm{precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}},$$
(20)

where positive (P): observation is positive (e.g. belongs to class 1); negative (N): observation is not positive; true positive (TP): observation is positive and is predicted to be positive; false negative (FN): observation is positive but is predicted to be negative; true negative (TN): observation is negative and is predicted to be negative; false positive (FP): observation is negative but is predicted to be positive.
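As a quick worked example of the two measures, the sketch below evaluates them for one class. The counts are invented toy numbers, and returning 0.0 when no sample is predicted positive is an assumption made to avoid division by zero.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all observations that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    # Assumption: report 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0

# Toy counts for one class, invented for illustration.
tp, tn, fp, fn = 7, 25, 3, 5
print(accuracy(tp, tn, fp, fn))  # 32 correct out of 40 observations
print(precision(tp, fp))         # 7 true positives out of 10 predicted positives
```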

The precision of the classification obtained from the evaluation process should be acceptable; otherwise, the standard vectors used to recognise pedestrian behaviours are modified, particularly by performing feature reduction, to enhance the accuracy, increase the number of correctly classified instances and reduce confusion between classes.

Discussion of results

The results of the proposed approach for pedestrian walking behaviour recognition are presented in two sections. The first section provides the results of factor identification and data collection (“Results of factor identification and data collection”) whilst the other one details the results and discusses data exchange in wireless communication and network failure contexts (“Results of data exchange in wireless communication and network failure contexts”).

Results of factor identification and data collection

"Questionnaire data analysis" presents the results of the enquiry about mobile phone users. The results of the pedestrian walking data collection are presented in "Pedestrian walking data analysis".

Questionnaire data analysis

A number of challenges were encountered during the questionnaire data gathering. One of the challenges was the collection of data on age and gender. The participants hesitated to fill in the questionnaire because of privacy concerns. The other challenge was that some of the questions were misunderstood; this issue rendered the affected questionnaires inadequate, and they thus needed to be answered again. To overcome the challenges, we obtained an approval and authentication letter from the university and showed it to each participant who wanted to know the reason for the questionnaire or data collection. We also highlighted that the data collection was meant for research purposes. Respondents of different ages and genders were contacted (i.e. students and lecturers), and an electronic link was shared with three research groups by means of WhatsApp. Some of the data were collected in malls and Starbucks. The questions and their respective answers were then analysed to determine the impact of mobile usage on pedestrian behaviour, which can subsequently affect pedestrian safety. The basic data are illustrated in Fig. 5a. The number of questionnaire respondents was 262, and their age range was 18–55 years. Nearly two-thirds (63.6%) of the respondents were male, and 36.4% were female, as shown in Fig. 5b.

Fig. 5
figure 5

Number of participants; a ages of participants, b gender of participants

As shown in Fig. 6a, 68.3% of the 262 respondents use their mobile phones whilst crossing the street; this percentage is large and should be considered. Figure 6b lists the seven possibilities why mobile phones are used whilst walking on the street. According to most of the respondents, they use their mobile phones mainly for calling (66.8%), followed by chatting (48.1%). As shown in Fig. 6c, which is related to how many messages are sent during a walking trip, most of the respondents select the first choice. Moreover, the largest share of respondents (68.6%) usually send fewer than 10 messages (Fig. 6d).

Fig. 6
figure 6

Results of questions: a main reason for use of mobile phone whilst walking; b ‘Do you use your mobile phone when you are crossing the street?’ c ‘How much time do you spend listening to music whilst walking?’ d ‘How many messages do you send whilst walking?’ e ‘Do you use your mobile phone whilst driving?’ f ‘Where do you normally keep your mobile phone when you are walking?’

Figure 6e shows the five choices for drivers’ mobile usage. The result shows that most of the drivers never use their mobile phones whilst driving; some do use their phones to access GPS or answer a call. This finding indicates that pedestrians use mobile phones more than drivers and thus suggests that pedestrians have riskier behaviours on the road. The question shown in Fig. 6f is related to the place where mobile phones are kept whilst walking. The majority (79.7%) of the respondents selected the first choice, which is to put their mobile phones in their pants’ pockets. As for the last question (Fig. 7), 137 respondents use Internet browsing/applications or the camera within 15 min of walking on the street; this large number should be urgently considered in the future.

Fig. 7
figure 7

Time spent on using mobile phone features whilst walking on the street

In conclusion, the main goal of the questionnaire is to present a comprehensive view of the walking behaviours of mobile phone users. The results indicate that many respondents use mobile phones whilst walking on the street for multiple purposes. Specifically, the walking behaviours of mobile phone users whilst on the street present a threat and should be considered urgently.

Pedestrian walking data analysis

The real-time pedestrian walking data were gathered using four scenarios that were independent of the results of the questionnaire data. Many difficulties were encountered during the experiment in terms of data collection. One of these challenges was choosing the right place to implement the experiments that required an appropriate street which was long enough to implement the experiment, particularly the walking or running scenario. Another challenge was the difficulty in persuading people to participate in the experiment because of their apprehension. The target respondents had difficulty understanding the idea of the experiment, and the researcher had to spend much time explaining the experiment and answering all questions. An additional challenge was the required number of respondents to do the test and repeat the same actions each time. The collection of data from the respondents of various ages also proved difficult. The gender aspect was another issue as more females refused to take part in the experiment, especially in the running scenario. The implementation of the aforementioned multiple scenarios was equally problematic as it took too much time and some individuals ultimately refused to participate because of the changing weather conditions. Other individuals misunderstood the experiment objectives, leading to repetitive experiments and recording.

Overall, we were able to overcome the challenges. A total of 263 samples were collected from male and female respondents with ages between 20 and 59 years (Table 1). For all scenarios, the respondents eventually took part in the experiments after they were shown the approval letter from the university. All participants were instructed to walk for a few seconds in various test locations for each scenario, and the readings of the gyroscope sensor were recorded. Data were lacking in two categories (40–49 and 50–59 years old) because of the difficulty of collecting data from older people. The lack of data in the last scenario was due to the refusal of some females to run on the street for the experiment.

Results of data exchange in wireless communication and network failure contexts

The results of the preprocessing and feature extraction are elaborated in "Preprocessing and feature extraction analysis". The results of the classification with and without a server are presented in "Results of classification in server" and "Result of classification without server", respectively.

Preprocessing and feature extraction analysis

The collected data were analysed, and the walking signal features were extracted to establish four standard vectors, each one of which represented a specific pedestrian movement. The gyroscope sensors recorded the walking signal data in time (t) and x, y and z. Figure 3 shows the proposed design to implement the analytical steps mentioned in "Preprocessing"; the design was also adopted for the 263 samples. Each sample comprised five features (MIN, MAX, MEAN, MEDIAN and STANDARD DEVIATION), and each feature consisted of three parts (original data, two-window subtraction and three-window subtraction), with each part containing positive and negative values for each of (time t) and x, y and z. Consequently, time (t) was removed from the analytical data because it did not provide the impression of actual movement of pedestrians unlike the rest of the coordinates of X, Y and Z that relied on the extraction of pedestrian walking features. In the test, 90 original features were obtained for each sample. Five features and three-axial gyroscope data were also derived.

Results of classification in server

The training and validation results for classification in the server on the five identified datasets are presented in three experiments. Experiments I–III cover classification using the random forest technique with whole features, attribute selection based on the random forest technique and classification using a decision tree with optimisation, respectively, as discussed below.

Experiment I result: classification using random forest technique with whole features

The random forest hyperparameters were fine-tuned to obtain the best outcomes. The hyperparameters, number of trees and sample split, had significant effects on the performance of random forest. The results of parameter tuning are shown in Table 3, where the parameter ‘number of trees’ was set to 100, 150, 200 and 250 and the ‘sample split’ was set to 5, 6, 7, 8, 9 and 10.

Table 3 Results of the parameters tuning

As shown in Table 3, the classification percentage using 100 trees ranged between 69.40% and 70.99%. The performance of the classifier was improved with 150 trees, and the best value obtained with nine sample split was 71.60%. With 200 trees and 10-sample split, the random forest classifier yielded the highest classification percentage up to 73.30%. Finally, by utilising 250 trees, the classification percentage varied from 69.40% to 70.68%, which was lower than when using 200 trees. Table 4 illustrates the findings of the classifiers with tenfold training for each category set, which presented the best evaluation measurement with the five categories.

Table 4 Results of the first experiment

As shown in Table 4, the random forest classifier with Category set V yielded the highest percentage of up to 87.05%, followed by 77.68% and 79.11%, which were achieved by random forest and random committee with Category sets III and IV, respectively. With Category set II, the random forest classifier obtained the lowest percentage of 70.98%. Figure 8 depicts the confusion matrix of the classifiers that performed the best across the five category sets. For category set I (Fig. 8), which consisted of four classes, the random forest classifier correctly recognised 15 out of 33 normal walking behaviours. Meanwhile, the random forest classifier misclassified the other 18 samples, which included 2, 4, and 12 samples classified as running, calling and chatting, respectively. For the running class, 26/32 were correctly classified whilst 5/32 and 1/32 were classified as normal walking and calling, respectively; 1 item was classified as chatting. With the calling class, the highest number (30/36) was sorted as calling. Only 1/36 each was classified as normal walking and running, and 4/36 were classified as chatting. Finally, 25/33 were correctly classified as chatting; the rest of the items were classified as normal walking (5/33) and running (3/33). The same explanation applies to the rest of the matrices. For category set II, random forest correctly classified 16, 27, and 26 of the 33 samples as normal walking, running and chatting, respectively. The confusion matrix (Fig. 8c) showed that the random forest classifier identified 14 and 58 samples in category set III as normal walking and aggressive, respectively. With category set IV, 31 out of 33 samples were properly categorised as normal, with just 2 samples misclassified as running. In the case of the running class, 6 samples were incorrectly categorised as normal whilst 26 samples were correctly classified as running. 
The random forest with category set V correctly categorised 31 out of the 33 samples as normal, with just 2 samples misclassified as running. Five samples were incorrectly categorised as normal in the running class whilst 27 samples were accurately classified as running.

Fig. 8
figure 8

Confusion matrix for each dataset. A Category set I: random forest; B category set II: random forest; C category set III: random forest; D category set IV: random committee; E category set V: random forest

Experiment II result: attribute selection based on random forest technique

To execute our analysis with a small yet meaningful feature set, we examined the extracted features and performed a feature selection phase. This stage is particularly important because the resource consumption of the target devices must be optimised, such that our approach can run on those devices, which typically have restricted resources. Table 5 shows the classification results of the five category sets based on the feature selection method in ML and the classifier that obtained high percentage classifications from the first method. Moreover, the same criteria were applied from the first method, considering the number of attributes (features) selected from the original 90 to ensure a fast and easy classification process.

Table 5 Results of second method

As demonstrated in Table 5, the performance of the classifiers evaluated with the five datasets varied slightly. For Category set I, 69 features were selected and a classification percentage of 74.22% was achieved. Category set II with 71 features achieved 71.61%. Category set III achieved a classification percentage of 78.68%. In Category set IV, the random forest classifier achieved up to 79.68% accuracy with 71 features selected. The random forest classifier using Category set V with 75 features produced the highest classification percentage of 90.26%. In conclusion, the accuracy for each category set differed between the first and second methods depending on the number of features selected.

Experiment III result: decision tree techniques based on grid optimisation

This section describes the results of the optimisation method. The optimum parameter set for the decision tree technique obtained in this experiment is shown in Table 6.

Table 6 Parameter set

The ‘weighting’ dataset was loaded using the retrieve operator, and then the optimise parameter (grid) operator was applied. The C and gamma parameters of the SVM operator of the optimise parameter (grid) operator were selected. The range of the C parameter (SVM.C) was set from 0.001 to 100,000. Eleven values were selected logarithmically in ten steps. Moreover, the range of the gamma parameter (SVM.gamma) was set from 0.001 to 1.5. Eleven values were selected logarithmically in ten steps. Each of the two parameters could take 11 possible values; thus, 121 (i.e. 11 × 11) combinations were expected. The subprocess was executed for all combinations of these values, and it was iterated 121 times. In each iteration, the values of the C and/or gamma parameters of the SVM (LibSVM) operator were changed. The value of the C parameter was 0.001 in the first iteration. The value was increased logarithmically until it reached 100,000 in the last iteration. Similarly, the value of the gamma parameter was 0.001 in the first iteration. The value was increased logarithmically until it reached 1.5 in the last iteration. In the subprocess of the optimise parameter (grid) operator, the data were initially split into two equal partitions by using the split data operator. Then, the SVM (LibSVM) operator was applied to one of the partitions. The resultant classification model was applied using the apply model operator on the second partition. The statistical performance of the SVM model in the testing partition was measured using the performance (classification) operators. The nested operator was also used to log the performance and parameters for each iteration. We found that the optimal parameter set had the following values: SVM.C = 398.107 and SVM.gamma = 0.001. The values logged by the optimise parameter (grid) operator were inspected to verify these values. The minimum testing error was 0.02 in the eighth iteration. 
The values of the C and gamma parameters for this iteration were the same as those in the optimal parameter set.
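The grid described above can be reproduced with a short sketch. It assumes that "logarithmically in ten steps" means eleven values evenly spaced on a logarithmic scale and that C varies fastest across iterations; both are assumptions about the tool, and the helper name `log_grid` is hypothetical.

```python
def log_grid(lo, hi, steps):
    """Return steps + 1 values evenly spaced on a logarithmic scale from lo to hi."""
    ratio = hi / lo
    return [lo * ratio ** (k / steps) for k in range(steps + 1)]

c_values = log_grid(0.001, 100_000, 10)   # candidate values for SVM.C
gamma_values = log_grid(0.001, 1.5, 10)   # candidate values for SVM.gamma

# Every combination of the two parameters, with C varying fastest (assumption).
combinations = [(c, g) for g in gamma_values for c in c_values]

print(len(combinations))       # 121
print(round(c_values[7], 3))   # 398.107
```

Under these assumptions, the eighth combination pairs C ≈ 398.107 with gamma = 0.001, which agrees with the optimal parameter set and the eighth iteration reported above.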

Figure 9 shows a visualisation of the decision tree for identifying the four classes in the tree shape and the percentage ratio of the classification test, which had 100% precision and recall of correct classification for all classes represented by normal walking, chatting, calling and running. In this method, we enhanced the classification process and increased the percentage of the correct classifications to 100% for each scenario.

Fig. 9
figure 9

Result of optimiser model

Result of classification without server

In this section, the results of the proposed model for pedestrian safety in the event of server failure are presented. Two subsections are discussed according to the number of features representing the vectors. "Results of the standard vector" presents the results of the standard vector evaluation before feature selection. "Standard vector evaluation after feature selection" discusses the results of the standard vector evaluation after feature selection. Each evaluation process is conducted for the 10 new samples of the normal walking, calling, chatting and running scenarios together with the four vectors.

Results of the standard vector

Four steps were carried out in the evaluation of the standard vectors. Each red-shaded cell in the tables represents an incorrectly classified sample.

As shown in Table 7, the test was conducted for the four vectors and the 10 new samples of normal walking. Equation (18) was calculated in this step. The comparative results shown in Table 7a implied that the precision of the classification according to Eq. (20) was 0%. The finding further suggested that none of the samples were correctly classified as normal walking. The 10 samples were incorrectly classified as chatting (7 samples) and running (3 samples), as represented by the red shaded cells in the table. Moreover, Table 7b shows that none of the samples were correctly classified as calling (0%). The samples were incorrectly classified as running (five samples) and chatting (five samples). As for the chatting samples, the comparative results (Table 7c) showed that all samples were correctly classified at a precision of 100% for the chatting scenario. For the running samples, the comparative results (Table 7d) showed that 80% of the samples were correctly classified for the running scenario. Only two samples were classified incorrectly as chatting. According to the results, the precisions for the first and second tests were zero, and the classification process was unsuccessful, unlike that for the third and fourth tests, in which the precisions were acceptable.

Table 7 Comparative study before feature selection of (a) normal walking samples, (b) calling samples, (c) chatting samples, (d) running samples

Standard vector evaluation after feature selection

On the basis of the observed results, the standard vectors were modified according to the feature selection, which was manually conducted. The five features of MIN, MAX, AVERAGE, MEDIAN and STANDARD DEVIATION were reduced into two features (MIN and MAX) for all the vectors, as shown in Eqs. (21)–(24) [13].

The feature selection was conducted to identify the most influential features. Figure 10 shows the mean values (MIN, MAX, AVERAGE, MEDIAN and STANDARD DEVIATION) for each feature in all datasets for each scenario. The comparative results indicated that only two features (MIN and MAX) exerted the most effect on all signals. The value of the first feature (MIN) for the normal walking signal, as depicted by the upper and lower values of the signal, ranged between 0 and −3. By contrast, the value of the MIN feature of the calling signal ranged between 0 and −3.8. The same trend was observed for the running signal, whose value ranged between 0 and −4.6. The value of chatting ranged between 0 and −2.2. Therefore, on the basis of the comparison, the five features (MIN, MAX, AVERAGE, MEDIAN and STANDARD DEVIATION) could be reduced into two original features (MIN and MAX).

Fig. 10 Comparison of means of walking behaviour signals

$${V}_{n}=\left[\frac{1}{N}\sum_{j=1}^{N}\min {R}_{j},\; \frac{1}{N}\sum_{j=1}^{N}\max {R}_{j}\right]$$
(21)
$${V}_{C}=\left[\frac{1}{N}\sum_{j=1}^{N}\min {R}_{j},\; \frac{1}{N}\sum_{j=1}^{N}\max {R}_{j}\right]$$
(22)
$${V}_{CH}=\left[\frac{1}{N}\sum_{j=1}^{N}\min {H}_{j},\; \frac{1}{N}\sum_{j=1}^{N}\max {H}_{j}\right]$$
(23)
$${V}_{R}=\left[\frac{1}{N}\sum_{j=1}^{N}\min {R}_{j},\; \frac{1}{N}\sum_{j=1}^{N}\max {R}_{j}\right]$$
(24)
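Equations (21)–(24) each define a two-component vector: the mean of the per-recording minima and the mean of the per-recording maxima. The sketch below shows how such a vector could be built and then used to label a new sample. The matching rule used here (nearest standard vector by Euclidean distance) is an assumption made for illustration and may differ from the comparison in Eq. (5); the function names and the recordings are likewise hypothetical.

```python
import math


def standard_vector(recordings):
    """Two-feature standard vector: (mean of per-recording MIN, mean of per-recording MAX)."""
    n = len(recordings)
    mean_min = sum(min(r) for r in recordings) / n
    mean_max = sum(max(r) for r in recordings) / n
    return (mean_min, mean_max)


def classify(sample, vectors):
    """Label a sample with the class whose standard vector is nearest.

    Assumption: Euclidean distance between (MIN, MAX) pairs; the source may use another measure.
    """
    features = (min(sample), max(sample))
    return min(vectors, key=lambda name: math.dist(features, vectors[name]))


# Made-up recordings for illustration only; they are not the study's data.
training = {
    "normal":  [[-1.0, 0.5, 1.0], [-1.2, 0.2, 1.2]],
    "running": [[-4.0, 0.0, 4.0], [-4.4, 1.0, 4.2]],
}
vectors = {name: standard_vector(recs) for name, recs in training.items()}
label = classify([-3.9, 0.3, 3.8], vectors)
print(vectors, label)
```

Because only two numbers per class need to be stored and compared, a matching step of this kind is light enough to run on a handset, which fits the goal of recognising behaviour when a server is unavailable.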

Subsequently, the same four steps for the evaluation of the standard vectors mentioned previously were repeated after feature selection.

Table 8a indicates that 70% of the samples were correctly classified as normal walking, and only three samples were incorrectly classified, as calling (1 sample) and chatting (2 samples). For the calling samples, 70% were correctly classified as calling (Table 8b); one sample was incorrectly classified as normal walking and two samples were incorrectly classified as running. Meanwhile, all chatting samples were correctly classified as chatting, and the precision was 100%, as shown in Table 8c. As for the running samples in Table 8d, 80% of them were correctly classified; one sample was incorrectly classified as calling and one as chatting.

Table 8 Comparative study after feature selection of (a) normal walking samples, (b) calling samples, (c) chatting samples, (d) running samples

Overall, the results indicated a clear improvement in classification. The classification precisions of the first and second steps, related to the normal walking and calling samples, were 0% before feature selection and 70% after feature selection. The results did not change for the third and fourth evaluations. Apart from the percentage of correctly classified samples, the classes to which samples were incorrectly assigned should also be considered. Three of the classes (running, calling and chatting) are considered aggressive. Therefore, a sample from one of these classes that is incorrectly assigned to another of them is unlikely to pose a risk to road safety because in all such cases an aggressive state (i.e. running, calling or chatting) will be alerted. Only one situation presented a threat to pedestrian and driver safety, that is, when aggressive samples were incorrectly classified as normal walking. In such a case, the driver will be alerted about the presence of a normal walking pedestrian, which may reduce the driver's attention and result in an accident.
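The safety argument above can be checked against the counts reported for Table 8. The following sketch, with hypothetical names, separates the one risky error type (an aggressive sample labelled as normal walking) from the other misclassifications; the counts are transcribed from the text describing Table 8.

```python
# Counts transcribed from the text describing Table 8 (after feature selection).
# Outer key: true class of the 10 test samples; inner dict: predicted class -> count.
TABLE_8 = {
    "normal":   {"normal": 7, "calling": 1, "chatting": 2},
    "calling":  {"calling": 7, "normal": 1, "running": 2},
    "chatting": {"chatting": 10},
    "running":  {"running": 8, "calling": 1, "chatting": 1},
}
AGGRESSIVE = {"running", "calling", "chatting"}


def missed_alerts(counts_by_true_class, safe="normal"):
    """Count aggressive samples labelled as the safe class (the only risky error)."""
    return sum(
        predicted.get(safe, 0)
        for true_class, predicted in counts_by_true_class.items()
        if true_class in AGGRESSIVE
    )


def harmless_confusions(counts_by_true_class):
    """Count aggressive samples labelled as a different aggressive class."""
    return sum(
        count
        for true_class, predicted in counts_by_true_class.items()
        if true_class in AGGRESSIVE
        for label, count in predicted.items()
        if label in AGGRESSIVE and label != true_class
    )


aggressive_total = sum(sum(TABLE_8[c].values()) for c in AGGRESSIVE)
risky = missed_alerts(TABLE_8)
harmless = harmless_confusions(TABLE_8)
print(risky, harmless, aggressive_total)
```

With these counts, one of the 30 aggressive samples (a calling sample labelled as normal walking) falls into the risky category, whereas four are confused with another aggressive class and would still trigger an alert.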

Two limitations in this research may be addressed in future work. The first limitation relates to the data collection for the questionnaire, which involved only 262 respondents. The second limitation is that the pedestrian walking behaviour data were collected from specific categories of pedestrians in terms of age (20–59 years old) and number (263 participants).

Comparative analysis with academic literature

Three articles focused on the problem of pedestrian behaviour classification. Table 9 summarises the main issues of the relevant studies, including the current work. The comparison is based on three aspects. The first relates to the factors for identifying irregular walking behaviour. The second concerns the data collected, including the primary dataset referring to the real-time collected data and the secondary data on the use of public data, dataset type, capture devices and number of participants. The third is data exchange, which covers the number of features, classification with and without a server and overall accuracy.

Table 9 Comparison with previous works

Table 9 presents the comparison based on three aspects, namely, factor identification, data collection and data exchange in wireless communication and network failure contexts.

In identifying the factors associated with pedestrian behaviour distractions, we built a questionnaire, analysed the questionnaire data and identified the aggressive behaviour characterised by the use of mobile phones or movements during specific activities. No previous study covered these factors (Table 9), whereas our work identified them by using the developed questionnaire, which secured 262 opinions from the participants. We then identified the irregular walking behaviour of pedestrians.

For data collection, human activity identification was conducted in Ref. [35] by collecting the video data of an activity. However, a limitation emerged when the edge of the image showed a candidate pedestrian. This work used a secondary dataset that was dependent on a video recorded from a camera. The work in Ref. [34] used secondary data or a public dataset. The work in Ref. [26] collected data using gyroscope and accelerometer sensors but had a limited number of participants (30 participants). In our proposed approach, data were collected from 263 participants in different scenarios.

In data exchange in wireless communication and network failure contexts, extracting the important features of walking pedestrians has a strong effect on pedestrian safety because it allows their positive and aggressive behaviours to be recognised. As summarised in Table 9, our proposed approach compares favourably with the previous works in terms of the number of features and the overall accuracy. Classification when a server is not available had not been explored in previous studies, even though server failure exerts a significant effect on communication systems and leads to link outage and even severe consequences for pedestrian life. Our approach includes a module for recognising pedestrian walking behaviour when servers are unavailable. Experimental results confirmed the efficacy of the proposed approach relative to previous methods.

Conclusion

This study proposed a novel approach for the classification of pedestrian walking behaviours in wireless communication and network failure contexts. A methodology for establishing the proposed approach for pedestrian safety was presented in five phases. The key steps of this methodology included the requirement preparation phase, identification phase, phase for the identification of all requirements for data gathering, the pre-processing phase, the classification and development phase and the validation and evaluation phase. The irregular walking behaviours of mobile phone users whilst walking on the street were explored using a questionnaire about mobile usage. The pedestrian walking behaviours were classified in three ML experiments based on two classifiers, namely, random forest and decision tree with multiple features, and the performance of the classification was then validated. Four standard vectors for walking behaviour recognition were developed, and their performance was evaluated using multiple scenarios and features. The developed approach was used to differentiate positive or normal walking from aggressive pedestrian behaviours (i.e. running, chatting or texting and talking on mobile phone whilst walking on the street). The three phases of the methodology yielded practical results. (1) Amongst the 262 sampled respondents, 66.80% and 48.10% used mobile phones for calling and chatting, respectively. These high percentages should be considered. (2) The analysed behaviours of the 263 sampled participants could be adopted to represent the possible features of pedestrian walking signals. (3) The precision of each class was 100% in the classification process based on the decision tree classifier in ML. (4) The four standard vectors used in this work could recognise pedestrian walking regardless of type (i.e. aggressive walking or normal walking). The classification precisions for normal walking and calling were 70%, whereas those for chatting and running were 100% and 80%, respectively.

In future research, an alerting application can be installed on smartphones to warn drivers of pedestrian behaviour (i.e., aggressive or normal behaviour) even when a server is unavailable. Furthermore, additional features on pedestrian behaviour recognition can be investigated in future research. Future work may also explore the following:

  1. The proposed scenarios can be applied to different categories of pedestrians, such as pregnant women, sick or disoriented people and people below 20 years old.

  2. Additional data can be collected to increase the reliability and efficiency of the proposed approach.

  3. Mobile applications for pedestrian safety can be programmed on the basis of the proposed approach for data exchange between vehicles and pedestrians in wireless communication and network failure contexts.

  4. Other scenarios for the collection of pedestrian behaviour data, such as pedestrian walking behaviour whilst video calling, can be added.