Modelling vehicular interactions for heterogeneous traffic flow using cellular automata with position preference

This paper proposes and validates a modified cellular automata (CA) model for determining the interaction rate (i.e. the number of car-following/overtaking instances) using traffic flow data measured in the field. The proposed model considers the lateral position preference of each vehicle type and introduces a position preference parameter β, which facilitates gradual drifting towards the preferred position on the road even if the gap in front is sufficient. The model also improves upon the conventional model by calculating safe front and back gaps dynamically, based on the speeds and deceleration properties of the leader and follower vehicles. Sensitivity analysis was carried out to determine the effect of β on vehicular interactions, and the model was calibrated and validated using interaction rates observed in the field. Paired tests were conducted to determine the validity of the model in determining interaction rates. Results of the simulations show a parabolic relationship between area occupancy and the interaction rate of different vehicle types. The model performed satisfactorily, as the simulated interaction rates between different vehicle types were found to be statistically similar to those observed in the field. Also, as expected, the interaction rate between light motor vehicles (LMVs) and heavy motor vehicles (HMVs) was found to be higher than that between LMVs and three wheelers because LMVs and HMVs share the same lane. This could not be reproduced using conventional CA models, in which lateral movement rules are dictated only by speeds and gaps, so that vehicles end up in unrealistic positions. The position preference parameter introduced in this model motivates vehicles to stay in their preferred positions. This study demonstrates the use of interaction rate as a measure to validate microscopic traffic flow models.


Introduction
One of the problems encountered in traffic safety analysis is that it is difficult to obtain reliable exposure between different vehicle types such as trucks, buses, cars, two wheelers (2Ws) and three wheelers (3Ws). In one of the few studies dealing with this issue, Nationwide Personal Travel Survey data were used to estimate vehicle miles driven as measure of exposure [1]. Bhalla et al. [2] estimated exposure between pair of vehicles as the product of number of vehicles of that type and average vehicle miles travelled by that vehicle. Overall exposure of vehicles can be estimated from origin-destination surveys or household surveys, but these are not easily available. However, these exposure estimates do not tell us much about the actual interaction between vehicles on the road. Attempts to correlate total distance travelled (exposure) to vehicular interaction would be inaccurate as vehicular interactions significantly depend on vehicle positioning pattern in addition to density and composition. Recently, some studies have tried to understand microscopic vehicular interactions that contribute to accidents [3][4][5]. Oh et al. [4] suggested that exposure is equal to the total time a vehicle pair spend in following. Their approach allows for more precise measurement of exposure between the two vehicle types on a given road. The simulation model was calibrated and validated using traffic characteristics such as volume, density, speeds, occupancy time, etc., and then used to predict the exposure. This approach of using microscopic traffic flow simulation can be further explored to determine the exposure or interaction rate between different types of vehicles using the field data.
Over the years, various microscopic traffic flow models have been developed to predict vehicular behaviour from a mid-block section of road to the network level. In microscopic traffic flow models, each vehicle is described by its own equation of motion; hence, the computational time and memory required are greater for these models. In this context, Cellular Automaton (CA) modelling has been found promising to meet this challenge. The concept of a microscopic traffic flow CA model was first introduced by Cremer and Ludwig [6]. Their study was followed by Nagel and Schreckenberg [7], whose model was found to be superior in modelling randomisation in traffic flow. This model was very basic, as only four rules governed the movement of vehicles in a stream: acceleration, deceleration, randomisation and repositioning of the vehicle based on its new speed. Even in its basic form, the model was able to replicate some of the traffic features of a homogeneous, single-lane road with periodic boundary conditions. Most CA-based traffic flow models have addressed homogeneous traffic flow and its behaviour. Owing to its discreteness, the CA model provides an opportunity to simulate large-scale real-time microscopic phenomena such as platoon formation and the capacity drop at transitions between free and congested flow. Later, others [8][9][10] contributed to the development of the model by adding more rule sets to increase its capability to replicate traffic features seen in multilane and heterogeneous traffic. Matthew et al. [11] proposed a modified cell size, randomisation rule and lane-changing rule for a CA model under heterogeneous conditions. Mallikarjuna and Rao [12] (Ma-Ra) developed a heterogeneous traffic model for Indian conditions. They found that traffic in India is highly heterogeneous with frequent lane changing; hence, it was necessary to modify Knospe's model to incorporate many types of vehicles and their lateral movements.
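The four rules of the basic Nagel and Schreckenberg model mentioned above (acceleration, deceleration, randomisation and repositioning) can be illustrated with a minimal sketch for a single-lane ring road. The function name, the default values of `v_max` and `p_slow`, and the one-cell vehicle length are illustrative assumptions and are not taken from any of the cited studies.

```python
import random


def nasch_step(positions, speeds, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel update of a single-lane ring road (periodic boundary).

    positions: distinct cell indices, one per vehicle (each vehicle fills one cell).
    speeds: current speeds in cells per time-step.
    Returns the new (positions, speeds) lists in the same vehicle order.
    """
    n = len(positions)
    order = sorted(range(n), key=lambda i: positions[i])
    new_positions = list(positions)
    new_speeds = list(speeds)
    for k, i in enumerate(order):
        leader = order[(k + 1) % n]
        # Empty cells between this vehicle and its leader, wrapping around the ring.
        gap = (positions[leader] - positions[i] - 1) % road_length
        v = min(speeds[i] + 1, v_max)        # rule 1: acceleration
        v = min(v, gap)                      # rule 2: deceleration to avoid collision
        if v > 0 and rng.random() < p_slow:  # rule 3: randomisation
            v -= 1
        new_speeds[i] = v
        new_positions[i] = (positions[i] + v) % road_length  # rule 4: repositioning
    return new_positions, new_speeds
```

Because every vehicle advances by at most the gap measured before the update, two vehicles can never occupy the same cell, which is what makes the parallel update safe.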
Traffic composition in India includes a significant proportion of motorised 2Ws and motorised 3Ws that have smaller dimensions than cars. Mallikarjuna and Rao reduced cell dimensions to 0.5 m in length and 1.75 m in width to represent different lengths and speed differentials in each time-step for various vehicle types. Lateral and longitudinal movement rules were also improved from earlier models. Zhao et al. [13] developed a CA model for determining interactions between motorised and non-motorised vehicles near a bus stop. This model incorporated the non-lane-based behaviour of non-motorised vehicles. Vasic and Ruskin [14] developed a CA model of the road network structure to simulate car and bicycle traffic on mid-blocks and at intersections. Xie et al. [15] developed a CA model for modelling interactions between vehicles and pedestrians at signalised crosswalks. Their results showed a critical value that divides the vehicle flow into free and congested flow portions. Zhang et al. [16] compared CA and gas dynamics models using speed-density characteristics of mixed bicycle traffic (i.e. bicycle traffic including electric bicycles). They found that the results produced by CA were more consistent with the observed data when density was lower, while the gas dynamics model performed better at densities higher than 0.3 bicycles/m². Tao et al. [17] proposed an improved brake light CA model with modified acceleration rules; to avoid over-deceleration, the randomisation probability and the extent of deceleration are determined from the result of the deterministic deceleration step. Xie and Zhao [18] developed a CA model that considered timid and aggressive driving behaviours. They modified the anticipated speed of the preceding vehicle with a new constant parameter, $\Delta v$, representing the aggressiveness of the driver.
Further, they found that even a small proportion of timid drivers significantly reduce the road capacity while it needs much more aggressive drivers to increase road capacity. Zheng [19] exhaustively reviewed the lane changing models available for microscopic traffic flow modelling. This study suggested that a comprehensive model that captures lane changing decisions needs to be developed and that this model should be able to predict vehicle trajectories close to that observed in field at microscopic level. At macroscopic level, the model should be able to produce fundamental traffic flow characteristics. Until now, researchers have evaluated CA models microscopically using individual vehicle trajectories [20] or macroscopically using traffic characteristics such as stream speed, average flow, density or occupancy and number of overtaking instances [8,10,12]. Pandey et al. [21] evaluated the CA model proposed by Mallikarjuna and Rao and found that even though the CA model could simulate fundamental diagrams (flow vs density plots) satisfactorily, it gave unexpected results when microscopic characteristics such as lane-maintaining behaviour, car-following and overtaking manoeuvres were compared with those observed in field. Also, the mean stream speed and capacity of road were higher than that observed on road. Authors believe that this was due to inadequacy that arises in not considering the heterogeneous nature of vehicles and difference in driving behaviour. Some of the inadequacies are listed below:

Inadequacies in lateral movement rules
In the Ma-Ra model, overtaking vehicles are supposed to meet two criteria: (a) incentive criterion and (b) safety criterion.
(a) The incentive criterion requires the longitudinal gap in the target lane to be greater than the current speed multiplied by a factor. That means lane-change behaviour was based purely on the availability of longitudinal gaps on the road, and vehicles had no preferred or desired position on the road. This may not be true, as vehicles have been found to have a preferred position on the road [21]. For example, heavy vehicles such as trucks and buses [henceforth referred to as heavy motor vehicles (HMVs)] tend to drive closer to the median (henceforth referred to as the 'median lane' in this paper), while lighter vehicles such as bicycles and three wheelers prefer travelling closer to the shoulder (henceforth called the 'shoulder lane' in this paper). This phenomenon plays a very important role in determining stream speeds and intervehicular interactions, as heavy vehicles prefer the innermost high-speed lane over the outer, slower lane consisting of three wheelers and bicycles. Lane choice therefore depends not only on maximum gap and speed but also on convenience. This was also evident in the simulated stream speeds, which were higher than those observed because vehicles changed lanes based on gap size alone and not on convenience or safety. Hence, there is a need to identify the inadequacies of the Ma-Ra model in determining the interaction rate between different vehicle types.
(b) The safety criterion requires the back gap to the incoming vehicle from behind to be greater than the current speed multiplied by a factor. It was based on the assumption that vehicles coming from the rear would never decelerate even if a vehicle entered their lane ahead of them. Hence, the required gap calculated by the safety criterion was much higher than those observed in the field [21]. This may cause large and unrealistic vehicular queues at lower densities, as vehicles would rarely get enough gaps for overtaking.

Inadequacy in longitudinal movement rules
The minimum safe distance between vehicles was considered constant, irrespective of the types of vehicles involved or their current speeds. This was unrealistic, as studies have found that vehicles maintain different longitudinal gaps based on the type of the leader vehicle. For example, an HMV may maintain a larger longitudinal gap while following a 3W or 2W carrying children than while following vehicles without children. It is also known that the minimum safe distance depends on the current speeds and the maximum deceleration rates of both vehicles.
Literature review in this area suggests that even when risk analysis has been an area of focus for researchers working in this field, most have assumed the number of crashes between vehicle pair as a subset of total exposure between them. Exposure was expressed as vehicle kilometers or vehicle hours travelled by that vehicle during the study period. Authors believe that this exposure is not accurate as it is based on the assumption that driver behaviour is similar across vehicle types and road types, which may not be true. Hence, there is a need to develop a new method of measuring exposure or interaction between vehicles based on the number of car-following and overtaking events observed in the field. It can thus be understood that the number of car-following and overtaking events between two vehicle types can be described as an interaction rate between two vehicle types. These are better correlated to crashes than vehicle kilometers or vehicle hours travelled. In this study, it was assumed that most crashes occur during car-following or overtaking events thus a microscopic traffic flow model based on CA could be explored to simulate the interactions between vehicle pairs. This led to the development of a position preference based CA (PP-CA) model for heterogeneous traffic conditions in the present study.
The rest of the paper is organised as follows. The modified longitudinal and lateral movement rules of the proposed model are discussed in Sect. 2. Section 3 discusses data collection and extraction methodology. Subsequently, Sect. 4 includes the validation of the model using fundamental diagrams and differences in observed and simulated interaction rates. Section 5 illustrates the application of the proposed model in determining the maximum interaction rates for different vehicle pairs in given traffic conditions.
2 Position preference based CA model

Model description
The inadequacy in lateral movement rules (Sect. 1 (1a)) is addressed by introducing a position preference parameter in the incentive criterion that reduces the probability of a lane change as the distance between the target position and the preferred position increases. Conventional brake light models assume that the probability of a lane change depends only on the speed of the subject vehicle, and hence use only one parameter, $\alpha$. This parameter captures the gap acceptance behaviour of the driver as a function of vehicle speed. In the proposed model, it is assumed that the probability also depends on the current position of the subject vehicle across the road width. As discussed earlier, vehicles tend to have a desired or preferred position on the road based on the vehicle type. As a result, vehicles often try to stick to their preferred lane even if a greater gap is available on the adjacent non-preferred lane. This phenomenon is explained in detail in Fig. 1, which shows various interacting factors that may affect the decision of the driver (i.e. of the subject vehicle) during a lane change instance in the model. In this study, the Ma-Ra model is used as the reference model for comparison, and hence 'the conventional model' refers to the Ma-Ra model, although the two models differ considerably.

Lateral movement rules
In this section, the lateral movement rules for the proposed model are presented. In Fig. 1, $x_n^t$ (grey vehicle) is a subject vehicle that is trying to decide between three options: Option 1 (lane change and follow the leader car $x_{n+2}^{t+1}$), Option 2 (lane change and follow the leader 3W $x_{n+3}^{t+1}$), and Option 3 (no lane change and keep following the leader vehicle $x_{n+1}^{t+1}$). Option 3 is close to a do-nothing scenario, as the subject vehicle keeps following the leader 3W even if 3Ws are slower than light motorised vehicles (LMVs), heavy motorised vehicles (HMVs), and 2Ws. Also, notice that Option 3 has a lower longitudinal gap ($g_n^{f}$) compared to Options 1 and 2 ($g_n^{tf1}$, $g_n^{tf2}$).
Here, $g_n^{f}$ is the front gap available to the subject vehicle ($n$) after considering the anticipated movement of the leader vehicle on the current lane. In Fig. 1, $g_n^{cf}$ is the minimum safe distance between the subject vehicle $n$ and its leader vehicle on the current lane (Option 3), calculated using Eqs. 1 or 2, whereas $g_n^{cf1}$ and $g_n^{cf2}$ represent the minimum safe distance between the subject vehicle $n$ and its leader vehicle for Options 1 and 2, respectively; $g_n^{tb}$ is the total back gap available on the target lane, $g_n^{cb}$ is the minimum safe gap on the target lane between the subject and incoming vehicles (Eq. 3), and $l_n$ is the length of the subject vehicle. According to conventional CA models, Option 2 is better as it offers a larger longitudinal gap ($g_n^{tf2}$) and hence higher speeds compared to Options 1 and 3 ($g_n^{tf1}$, $g_n^{f}$). But Options 2 and 3 would put the subject vehicle behind 3Ws ($n+1$, $n+3$), which have the slowest speeds and relatively higher maximum deceleration rates of the four vehicle types considered in the study. Hence, it can be assumed that the braking distance for 3Ws, which is a function of the deceleration rate and speed of the vehicle, would be lower than that of LMVs, HMVs and 2Ws. So, a subject vehicle following 3Ws ($n+1$, $n+3$) needs to maintain a larger gap (safe distance) than when following LMVs, HMVs and 2Ws. Hence, even if $g_n^{tf2}$ is larger than $g_n^{tf1}$ and $g_n^{f}$, the effective gap ($g_n^{tf2} - g_n^{cf2}$) for Option 2 can be smaller than those for Options 1 and 3. Also, if the gap $g_n^{tf2}$ is not large enough, the subject vehicle may be forced to change lane again and return to its present lane. On the other hand, Option 1 would put the subject vehicle behind a car ($n+2$). If the subject vehicle is also a car, Option 1 would allow higher speeds compared to Options 2 and 3, in spite of Option 2 offering the largest gap. Since most cars travel closer to the median lane (Option 1) [21], Option 1 is the most preferred position for the subject vehicle. Option 1 would bring the subject vehicle closer to its preferred position and Option 2 would take it away from the preferred position. Hence, if the gap $g_n^{tf2}$ is not large, Option 1 would appear better than Option 2.

[Fig. 1, labels recovered from the figure: Median; Direction of movement; Truck; Three-wheeler; Two-wheeler; Car; Subject vehicle; Option 1: Δ = 10 − 10 = 0, closest to the most preferred cell for trucks]

To incorporate this phenomenon, an additional parameter beta ($\beta$) is included to improve the lane-keeping profiles of vehicles. In the proposed PP-CA model, the probability of a subject vehicle making a lane change to a target lane decreases with an increase in the speed of the subject vehicle and in the difference between the current position and the preferred position of the subject vehicle, represented by $\Delta x_n$. Together, $\alpha$ and $\beta$, respectively, represent the mandatory and voluntary aspects of lane changes as observed in the field. Figure 2 shows the incentive and safety criteria used to decide if the gap in the target lane ($g_n^{tf}$) is large enough to justify a lane change. The decision to change lane is based on the current speeds of the subject and leader vehicles, denoted by $v_n^t$ and $v_{n+1}^t$, respectively, and the differences of the current and target positions from the preferred position, denoted by $\Delta x_n$ and $\Delta x_n^t$, respectively. Hence, the incentive criterion can be divided into two parts:
1. The gap calculated after considering the speed and position on the target lane (the left-hand side of the incentive criterion) is larger than that calculated for the current lane (the right-hand side of the incentive criterion).
2. Either the speed of the subject vehicle is zero, or the speed of the leader vehicle is less than the maximum speed of the subject vehicle.
Further, the safety criterion ensures that the total gap available on the target lane is larger than the sum of the safe gap between the subject vehicle and the incoming vehicle on the target lane and the length of the subject vehicle.
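The two-part incentive criterion and the safety criterion described above can be sketched as follows. The exact expressions of Fig. 2 are not reproduced in the text, so the linear penalty used here (effective gap minus $\alpha$ times speed minus $\beta$ times the offset from the preferred position) is only an assumed form chosen to show how $\beta$ changes the decision; the class, function and field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LaneOption:
    front_gap: float               # available front gap on this lane (cells)
    safe_front_gap: float          # minimum safe distance to this lane's leader (cells)
    offset_from_preferred: float   # lateral distance to the preferred position (cells)
    back_gap: float = float("inf")  # total back gap on the target lane (cells)
    safe_back_gap: float = 0.0      # minimum safe gap to the incoming vehicle (cells)


def incentive_met(current, target, speed, leader_speed, v_max, alpha, beta):
    """Two-part incentive check. The linear penalty form is an ASSUMPTION."""
    target_value = (target.front_gap - target.safe_front_gap
                    - alpha * speed - beta * target.offset_from_preferred)
    current_value = (current.front_gap - current.safe_front_gap
                     - beta * current.offset_from_preferred)
    part1 = target_value > current_value          # target lane is more attractive
    part2 = speed == 0 or leader_speed < v_max    # subject is stopped or held back
    return part1 and part2


def safety_met(target, vehicle_length):
    """Total back gap must exceed the safe back gap plus the subject's length."""
    return target.back_gap > target.safe_back_gap + vehicle_length
```

With $\beta = 0$ the decision in this sketch depends on gaps and speed only; increasing $\beta$ makes a target position far from the preferred position progressively less attractive, even when it offers the larger gap.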
Thus, the proposed model is suited to heterogeneous traffic conditions in developing countries, where different vehicle types with varying maximum speeds and acceleration rates are forced to share lanes. In these conditions, it is common to observe vehicles not initiating a lane change for fear of getting stuck in non-preferred lanes/positions and behind slower vehicles. But the model is generally applicable to roads with slower and faster lanes, as vehicles would prefer to stay on the faster lane even if a larger gap is available on the slower lane/position. In conventional models, vehicles can end up in unrealistic positions [21]. For example, HMVs travel closer to the shoulder lane instead of the median lane. The position preference parameter allows vehicles to drift towards their preferred position on the road, as observed in the data. This improvement in the model significantly changes the outcome of the simulation, as shown in subsequent sections. Also, in CA models, lane change rules can be symmetric or asymmetric with respect to lanes or vehicle type [9]. In symmetric models, both lanes are treated equally, or all vehicle types have equal probability of acquiring a position on the cell lattice. Asymmetric models are applicable when left/right overtaking is banned or when certain vehicle types are not allowed to acquire a certain position on the road. In our model, the position preference parameter ($\beta$) allows switching between symmetric and asymmetric rules. At $\beta = 0$, the model is perfectly symmetric, but as the value of $\beta$ increases it becomes more and more asymmetric.
The longitudinal movement rules follow those of Mallikarjuna and Rao [12] for heterogeneous conditions, but with some modifications. In this model, originally based on Knospe's brake light model [9], the subject vehicle reacts to the 'brake light status' of the leader vehicle. If the gap in front is less than the interaction headway, it adjusts itself based on the speed of the leader vehicle. If the gap is more than the interaction headway, the subject vehicle accelerates until it reaches a desired speed.
If the gap is less than the safe gap, the subject vehicle decelerates until safe driving conditions are achieved. The acceleration is modified with a probability term $p_o$ when the vehicle starts from rest and $p_{dec}$ when the vehicle slows down. In this study, the longitudinal rules were modified such that the security gap used for calculating the effective gap for determining safe driving conditions is calculated dynamically, based on the speeds and maximum deceleration rates of the subject and leader vehicles (Eq. 1). The longitudinal movement rules shown in Fig. 3 are explained below.

[Fig. 2: incentive and safety criteria for lane change (recovered label: Incentive criterion)]
Step 1 The value of the randomisation parameter determines the probability of deceleration based on the headway and speed of the subject vehicle and the brake light status of the leader. If the brake light is on (= 1) and the headway ($t_n^{h}$) is less than the interaction headway ($t^{s}$), then the probability of deceleration is $p = p_{bl}$. If the speed of the subject vehicle ($v_n^t$) is zero, then the probability of deceleration is $p = p_o$. Otherwise, $p = p_{dec}$.
Step 2 The subject vehicle would accelerate if the braking status of the leader is off (=0) and the effective headway is larger than the interaction headway.
Step 3 The subject vehicle would decelerate if the speed obtained from acceleration rule is larger than that for a safe gap.
Step 4 The randomisation rule is applied based on the probabilities calculated in Step 1 to capture the stochastic behaviour of drivers in the field assuming that vehicles decelerate randomly.
Step 5 Subject vehicle's position is updated based on the speed obtained from Step 4.
In Fig. 3, $t_n^{h}$ is the available time headway for the subject vehicle $n$; the leader vehicle is referred to as $n+1$ and the following vehicle as $n-1$; $t^{s}$ is the interaction headway between the subject and leader vehicles; $v_n^t$ and $v_n^{t+1}$ are the speeds of the subject vehicle at time-steps $t$ and $t+1$, respectively; $v_n^{a}$, $v_n^{b}$ and $v_n^{t+1}$ are the updated speeds of the subject vehicle after applying the acceleration, braking and randomisation rules, respectively; $l_n$ is the length of the subject vehicle; $a_n(v_n^t, l_n)$ is the acceleration, which is a function of the speed and vehicle type of the subject vehicle; similarly, $d_n(l_n)$ is the deceleration rate of the subject vehicle, which is a function of vehicle type; $v_n^{\max}$ is the maximum speed of the subject vehicle; $x_n^t$ and $x_n^{t+1}$ are the longitudinal positions of the subject vehicle at times $t$ and $t+1$, respectively; $p_o$, $p_{bl}$ and $p_{dec}$ are the probabilities of the subject vehicle applying brakes randomly under the different conditions mentioned in Step 1 (i.e. determination of the randomisation parameter); $p_{lc}$ is the probability of lane change at any time-step. The values of these parameters are presented in Sect. 5.2.

[Fig. 3: Steps in longitudinal movement procedure. Step 1: determination of the randomisation parameter; Step 2: acceleration; Step 3: braking rule; Step 5: car motion]
Safe distance is the minimum gap a vehicle would maintain in order to avoid a collision in case the front vehicle brakes suddenly. $b_n^t$ and $b_{n+1}^t$ are binary variables denoting the brake light status of the subject and leader vehicles, respectively, at time $t$ (if equal to 1, the brake light is on).
Authors observed that in staggered driving conditions, the headways can be much lower than that required for a normal deceleration process. This suggests that drivers often keep a minimum gap considering maximum deceleration capabilities of vehicles while following. Also, as the subject vehicle's speed would be limited by safe gap (braking rule), the erratic deceleration behaviour of conventional CA models is avoided. This increases the scope of the model as it can now simulate sudden braking of the leader vehicle without causing collision or unrealistic deceleration of the subject vehicle.
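The five longitudinal steps described above can be sketched for a single vehicle as follows. This is a minimal illustration rather than the exact procedure of Fig. 3: limiting the speed in Step 3 to the effective gap (front gap minus safe gap), reducing the speed by one cell per time-step in Step 4, and switching the brake light on whenever the speed drops are all assumptions borrowed from common brake light CA formulations. All names are hypothetical, and speeds and gaps are in cells per time-step and cells.

```python
import random


def longitudinal_step(x, v, v_max, accel, headway, t_s, gap, safe_gap,
                      leader_brake_on, p_bl, p_o, p_dec, rng=random):
    """Return (new_position, new_speed, brake_light_on) for one vehicle."""
    # Step 1: choose the randomisation (random braking) probability.
    if leader_brake_on and headway < t_s:
        p = p_bl
    elif v == 0:
        p = p_o
    else:
        p = p_dec

    # Step 2: accelerate if the leader is not braking and the headway is large.
    if not leader_brake_on and headway > t_s:
        v_a = min(v + accel, v_max)
    else:
        v_a = v

    # Step 3: braking rule, ASSUMED here to cap speed at the effective gap.
    v_b = min(v_a, max(gap - safe_gap, 0))

    # Step 4: randomisation, ASSUMED to remove one cell per time-step.
    v_next = max(v_b - 1, 0) if rng.random() < p else v_b

    # Brake light status, ASSUMED to switch on when the vehicle slows down.
    brake_on = v_next < v

    # Step 5: move the vehicle.
    return x + v_next, v_next, brake_on
```

For a stationary vehicle the time headway is undefined, so a caller would pass `float("inf")` for `headway` in that case.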

Safe gap calculations-following gap
While applying the longitudinal movement rules of the proposed model, the safe following distance was calculated dynamically instead of adopting a constant value. As discussed in (2) of Sect. 1, the safe distance between vehicles depends on their vehicle types and speeds, and hence adopting a constant value is not very realistic. Therefore, the safe distance between the leader and follower was assumed to be a function of the velocities and deceleration rates of the two vehicles. The safe distance, shown in Eq. 1, was calculated as the difference between the distance travelled during the reaction plus braking time of the subject vehicle and the distance travelled by the leader vehicle during that time. Figure 4 illustrates car-following and explains the basis for determining the minimum safe distance $g_n^{cf}$. In Fig. 4, the leader vehicle $n+1$ applies brakes at time $t = 0$; the subject vehicle $n$ then applies brakes after a reaction time $t_n^{r}$. $T_n$ is the time when the subject vehicle's speed becomes zero or equal to that of the leader vehicle. In order to avoid a collision at $t = T_n$, the distance covered by the subject vehicle $n$ between $t = 0$ and $t = T_n$ should be less than the sum of the gap between the two vehicles at $t = 0$ and the distance covered by the leader vehicle between $t = 0$ and $t = T_n$. In this condition, $d_n^{\max}$ and $d_{n+1}^{\max}$ are the maximum deceleration rates of the subject vehicle $n$ and the leader vehicle $n+1$, respectively; $v_n^t$ and $v_{n+1}^t$ are the speeds of the subject and leader vehicles at time $t$; and $g_n$ is the gap available for the subject vehicle $n$.
When $g_n = g_n^{cf}$, this condition gives the safe distance required by the subject vehicle while following (Eq. 1). A negative value for $g_n^{cf}$ means that the distance covered by the leader vehicle is larger than that covered by the subject vehicle, and hence a collision would never happen as the subject vehicle would not be able to reach the leader vehicle. Thus, in this case, the minimum safe distance would be maintained when both vehicles travel an equal distance. Hence, assuming the second term of Eq. 1 is equal to the third term, the safe distance required by the subject vehicle would be given by Eq. 2.
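A minimal sketch of the safe following gap is given below. It assumes constant deceleration, so that each braking distance is $v^2/(2d)$; under that assumption the verbal description above corresponds to $g_n^{cf} = v_n^t t_n^{r} + (v_n^t)^2 / (2 d_n^{\max}) - (v_{n+1}^t)^2 / (2 d_{n+1}^{\max})$, with the reaction distance $v_n^t t_n^{r}$ used when the result is negative. This closed form is an assumption consistent with the text, not a transcription of Eqs. 1 and 2, and the function and parameter names are illustrative.

```python
def safe_following_gap(v_follower, v_leader, d_follower, d_leader, t_react):
    """Minimum safe following distance in metres (ASSUMED kinematic form).

    v_*: speeds in m/s; d_*: maximum deceleration rates in m/s^2 (positive);
    t_react: reaction time of the follower in seconds.
    """
    reaction_dist = v_follower * t_react
    follower_braking = v_follower ** 2 / (2 * d_follower)
    leader_braking = v_leader ** 2 / (2 * d_leader)
    gap = reaction_dist + follower_braking - leader_braking
    # A negative value means the follower can never reach the leader;
    # fall back to the case where both braking distances are equal.
    return gap if gap >= 0 else reaction_dist
```

Under these assumptions, a leader with a higher maximum deceleration rate (such as a 3W) stops in a shorter distance, so the follower needs a larger gap: `safe_following_gap(10, 10, 4, 5, 1)` is larger than `safe_following_gap(10, 10, 4, 4, 1)`.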

Safe gap calculations-back gap
The inadequacy in the lateral movement rule (Sect. 1 (1b)) is addressed by determining the back gap distance dynamically using vehicular deceleration rates and current speeds, as explained below. In the proposed model, it is assumed that while making a lane change, the subject vehicle only looks for a safe stopping distance between itself and the incoming vehicle from the rear on the target lane, which is denoted by $g_n^{cb}$. Conventional brake light models require this distance to be equal to a factor ($\alpha$) multiplied by the speed of the incoming vehicle. This means they ignore the fact that the incoming vehicle would decelerate in the following time-steps upon seeing the subject vehicle entering the lane ahead of it. They also ignore the speed of the subject vehicle attempting a lane change in calculating the safe distance. This leads to a very conservative lane-changing model, especially for India, where lane-changing behaviour is assumed to be much more aggressive. The safe back gap $g_n^{cb}$, which is the gap between the subject vehicle (attempting a lane change) and the incoming vehicle on the target lane, is calculated considering the distances covered by the two vehicles, shown in Fig. 4. Here, unlike the minimum following distance $g_n^{cf}$, where both vehicles decelerate, the incoming vehicle $n-1$ decelerates while the subject vehicle $n$ accelerates or maintains its current speed on the target lane. Hence, the braking distance for the subject vehicle in Eq. 1 is replaced by the total distance covered by the subject vehicle assuming it maintains its current speed on the target lane. The assumption that the subject vehicle maintains its current speed on the target lane always gives a safer back gap than the assumption that the vehicle accelerates on the target lane. Hence, replacing the distance covered by the leader vehicle with the distance covered by the subject vehicle in Eq. 1 results in Eq. 3, where $t_{n-1}^{r}$ is the reaction time of the incoming vehicle, $d_{n-1}^{\max}$ is the maximum deceleration rate of the incoming vehicle from the back on the target lane, and $v_n^t$ and $v_{n-1}^t$ are the speeds of the subject and incoming vehicles, respectively, at time $t$. For negative $g_n^{cb}$, its value is taken as the same as in Eq. 2. Note that $g_n^{cf}$ and $g_n^{cb}$ are based on continuous equations and then discretised. As the cell length is 0.5 m, which is quite small compared to other CA models, some accuracy loss during discretisation (rounding to the nearest 0.5 m) would not affect the model performance. The following sections present the data collection effort and the implementation and validation of the PP-CA model.
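The back gap calculation and the discretisation step can be sketched in the same way. Several choices here are assumptions rather than details confirmed by the text: the incoming vehicle is taken to brake to a stop at constant deceleration, the subject vehicle is taken to hold its speed for the whole of the incoming vehicle's reaction plus braking time, and the fallback for a negative result is taken to be the incoming vehicle's reaction distance. The names are illustrative.

```python
import math

CELL_LENGTH_M = 0.5  # cell length used in the model


def safe_back_gap(v_subject, v_incoming, d_incoming, t_react):
    """Minimum safe back gap in metres on the target lane (ASSUMED form)."""
    braking_time = v_incoming / d_incoming
    incoming_dist = v_incoming * t_react + v_incoming ** 2 / (2 * d_incoming)
    # ASSUMPTION: the subject vehicle keeps its current speed throughout.
    subject_dist = v_subject * (t_react + braking_time)
    gap = incoming_dist - subject_dist
    # ASSUMPTION: negative gaps fall back to the incoming vehicle's reaction distance.
    return gap if gap >= 0 else v_incoming * t_react


def to_cells(gap_m, cell_length=CELL_LENGTH_M):
    """Discretise a continuous gap by rounding to the nearest cell."""
    return math.floor(gap_m / cell_length + 0.5)
```

Rounding with `math.floor(x + 0.5)` is used instead of the built-in `round` so that exact half-cell values always round up rather than to the nearest even number.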

Data collection
In Ludhiana city, Punjab, India, eight arterial roads, namely (1) Chima Intersection-Samrala Intersection, (2) Chima Intersection-Vishwakarma Intersection, (3) Jagraon Bridge-Jalandhar Bypass, (4) Bharatnagar Intersection-Jagraon Bridge, (5) Bharatnagar Chowk-Model Gram, (6) Bhaiwala Chowk-Shastri Nagar, (7) Ludhiana Bypass and (8) Kundan Vidya Mandir Lane, were selected for this traffic survey. These roads were selected because of the availability of vantage points for mounting cameras and the variations in flow among them. A total of 16 h of traffic surveys were conducted using a video camera during peak (09:00-10:00) and off-peak hours (12:00-13:00). Pedestrian foot-over bridges were used to mount cameras, as these locations provided a clear view of an 80 m road stretch. The perspective of the road from the camera also suited the image processing software used for data extraction (TRAZER™). A rectangular trap of 60 m × 7 m on the road was delineated at the beginning to facilitate software calibration. Vehicle trajectories were drawn in TRAZER™, a video image processing software developed by Kritikal Solutions Limited, India (www.kritikalsolutions.com). Due to software limitations, all trajectories were manually marked to ensure accuracy. Figure 5 shows the marked vehicles in TRAZER™. The objectives of the study required accurate speed and gap determination, and hence the accuracy of trajectories was more important than the number of trajectories. Since each hour of recording contained thousands of vehicles, this was assumed to be enough for determining speeds and gaps statistically. Hence, it was decided to collect 2 h of data on each arterial in the first round and then collect more videos for roads with high variability (if it existed). A regression towards mean approach was used to determine the adequate sample size on each road. More data would have been unnecessary and expensive, as each hour of data requires more than a week of work if extracted manually and accurately. A total of 4,983 vehicle trajectories, containing frame-wise x-y coordinates of each vehicle on every 25th frame, were created to derive traffic flow characteristics. A flow chart explaining the steps for data extraction is presented below.

Flow chart for extracting microscopic characteristics
A MATLAB program was developed to extract traffic characteristics such as individual vehicle speeds and gaps, acceleration/deceleration, density, flow and total exposure between vehicle types. Figure 6 shows the flow chart of the algorithm. A detailed outline of the flow chart is presented below.
1. Input consists of x-y coordinates of vehicles, frame IDs, vehicle IDs and vehicle type.
(c) Apply the moving average smoothing technique to each trajectory.
(d) Calculate speed using the first and the last y-coordinates in the trajectory and the corresponding frame IDs. Each frame is 1 s apart from the previous one.
(e) Calculate acceleration and deceleration using the speed differential for each trajectory.
8. Calculate area occupancy as the ratio of the total projection area occupied by all vehicles in a frame array to the total road trap area.
9. Calculate lateral and longitudinal gaps for vehicles that overlap along length and width, respectively, in a particular frame.
10. Determine the interaction rate between different vehicle types by measuring the number and types of vehicles involved in overtaking and car-following, based on their speeds and whether or not they have lateral and longitudinal gaps. In this study, the interaction rate between two vehicle types is defined as the number of vehicles (say type A) found to be following or overtaking vehicles of another type (say type B) per 1,000 observed vehicles. A vehicle was considered to be following if its path had at least 50% overlap with that of the preceding vehicle along the direction of movement (Fig. 7). This was determined from the x-y coordinates provided by TRAZER. Since the camera could only focus on 60 m of road length in front of it, a vehicle having a longitudinal gap of more than 60 m was not considered to be following. Similarly, a vehicle was considered to be overtaking only if it had a lateral overlap (>50%) with an adjacent vehicle and its speed was higher than that of the adjacent vehicle.
11. The output was stored in three-dimensional arrays for further processing.
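The following and overtaking criteria in step 10 can be sketched as follows. This is a rough illustration in Python rather than the authors' MATLAB program. It assumes that vehicles are axis-aligned rectangles with y along the direction of travel, that overlap is measured relative to the subject vehicle's own width or length, and that "lateral overlap" for overtaking means the two vehicles are side by side along their lengths. The `Vehicle` class, the function names and the vehicle dimensions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    x: float       # lateral position of the left edge (m)
    y: float       # longitudinal position of the rear edge (m)
    width: float   # m
    length: float  # m
    speed: float   # m/s


def overlap_fraction(a_start: float, a_len: float, b_start: float, b_len: float) -> float:
    """Share of interval A that is covered by interval B (0 to 1)."""
    overlap = min(a_start + a_len, b_start + b_len) - max(a_start, b_start)
    return max(0.0, overlap) / a_len


def is_following(follower: Vehicle, leader: Vehicle,
                 max_gap_m: float = 60.0, min_overlap: float = 0.5) -> bool:
    """At least 50% path overlap with a leader no more than 60 m ahead."""
    gap = leader.y - (follower.y + follower.length)
    if gap < 0 or gap > max_gap_m:
        return False
    return overlap_fraction(follower.x, follower.width,
                            leader.x, leader.width) >= min_overlap


def is_overtaking(subject: Vehicle, other: Vehicle, min_overlap: float = 0.5) -> bool:
    """Side by side for more than 50% of the subject's length, and faster."""
    side_by_side = overlap_fraction(subject.y, subject.length,
                                    other.y, other.length) > min_overlap
    laterally_clear = overlap_fraction(subject.x, subject.width,
                                       other.x, other.width) == 0.0
    return side_by_side and laterally_clear and subject.speed > other.speed


if __name__ == "__main__":
    car = Vehicle(x=1.0, y=0.0, width=1.6, length=4.0, speed=10.0)
    bus = Vehicle(x=1.5, y=20.0, width=2.5, length=10.0, speed=8.0)
    bike = Vehicle(x=4.5, y=22.0, width=0.7, length=2.0, speed=12.0)
    print(is_following(car, bus), is_overtaking(bike, bus))
```

Measuring overlap relative to the subject vehicle keeps the test asymmetric, so a narrow two-wheeler directly behind a bus counts as following even though it covers only a small part of the bus width.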

Simulation setup
A MATLAB code was written to simulate a 7-m-wide two-lane road of length 5,000 m and four vehicle types, namely HMVs, LMVs, 3Ws and 2Ws. As mentioned earlier, in CA models the road is represented by a lattice of uniformly sized cells, and the size of the cells affects the computational time and accuracy of the model. Finer cell sizes result in higher accuracy, as vehicular gaps and dimensions can be represented more accurately, but it is commonly believed that they also increase computational time because more cells in the lattice need to be processed at every time-step. However, the authors believe that computational time depends more on the density of vehicles on the road, the number of lanes and the length of road to be simulated, whereas cell size has less effect. Hence, it was decided to adopt a size that can accurately represent the smallest vehicular gap and dimension observed in the study. Since, in mixed traffic, the space headway could be as little as 0.5 m during queues in jam conditions, and 2Ws are the smallest vehicles in the study with a maximum width of 0.7 m, a lattice with a periodic boundary condition consisting of 10,000 cells along its length, each of size 0.5 m × 0.7 m, was used for simulation. Open boundaries are usually not preferred, as longer lattices are required for the various simulation phases. Further, as the length of the lattice increases, the number of vehicles to be processed at each time-step also increases for a given density. This increases computational time and still does not guarantee a steady state before measurements. All road links were simulated using their traffic composition and densities per kilometre as input. Measurements were taken through a virtual detector of length 60 m in the middle of the lattice, to replicate the 60 m camera trap used in the field, and the results were then averaged at different occupancies.
A total of 10 simulation runs were carried out, with each run simulating 3,600 s at a resolution of 8 time-steps per second. Hence, there were a total of 28,800 (3,600 × 8) time-steps in each simulation run. For each simulation run, the first 800 time-steps were discarded as a warm-up to eliminate the initial noise. The simulated vehicular trajectories were used to obtain characteristics such as individual speeds, stream speeds, gaps, occupancies and the proportion of each vehicle type in car-following or overtaking. Vehicles do not necessarily continue in the same lane, so in this study car-following represents not only vehicles in perfect car-following, but also those in staggered car-following [22] with some degree of lateral overlap (>50%) in the direction of travel, as shown in Fig. 7. Since all eight roads had similar geometry, the results can be attributed to traffic flow characteristics.
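The lattice dimensions implied by the figures above can be checked with a short sketch. It is written in Python for illustration (the authors used MATLAB), and the helper names `advance` and `in_detector` are hypothetical. Placing the detector exactly at the centre of the lattice is an assumption based on the phrase "in the middle of lattice".

```python
ROAD_LENGTH_M = 5000.0
ROAD_WIDTH_M = 7.0
CELL_LENGTH_M = 0.5
CELL_WIDTH_M = 0.7
DETECTOR_LENGTH_M = 60.0

# Lattice size in cells: 5,000 m / 0.5 m along the road, 7 m / 0.7 m across it.
N_LONG = round(ROAD_LENGTH_M / CELL_LENGTH_M)
N_LAT = round(ROAD_WIDTH_M / CELL_WIDTH_M)

# Virtual detector of 60 m (120 cells), centred on the lattice (assumption).
DETECTOR_CELLS = round(DETECTOR_LENGTH_M / CELL_LENGTH_M)
DETECTOR_START = (N_LONG - DETECTOR_CELLS) // 2
DETECTOR_END = DETECTOR_START + DETECTOR_CELLS  # exclusive


def advance(cell: int, speed_cells: int) -> int:
    """Move along the road with a periodic boundary: leaving the end re-enters at the start."""
    return (cell + speed_cells) % N_LONG


def in_detector(cell: int) -> bool:
    """True if a longitudinal cell index lies inside the virtual detector."""
    return DETECTOR_START <= cell < DETECTOR_END


if __name__ == "__main__":
    print(N_LONG, N_LAT, DETECTOR_CELLS, DETECTOR_START, DETECTOR_END)
    print(advance(9998, 5), in_detector(5000))
```

The periodic boundary keeps the number of vehicles, and hence the density, constant during a run, which is why the text can prescribe density as an input rather than an inflow.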
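The time-step bookkeeping described above works out as in the sketch below. The constant and function names are hypothetical, and storing speeds in cells per time-step is an assumption made only to show the unit conversion that follows from a 0.5 m cell and 8 time-steps per second.

```python
STEPS_PER_SECOND = 8
RUN_DURATION_S = 3600
WARMUP_STEPS = 800
N_RUNS = 10
CELL_LENGTH_M = 0.5

DT_S = 1 / STEPS_PER_SECOND                        # 0.125 s per time-step
STEPS_PER_RUN = RUN_DURATION_S * STEPS_PER_SECOND  # 28,800 time-steps
WARMUP_S = WARMUP_STEPS * DT_S                     # warm-up expressed in seconds
MEASURED_STEPS_PER_RUN = STEPS_PER_RUN - WARMUP_STEPS


def cells_per_step_to_mps(cells_per_step: float) -> float:
    """Convert a lattice speed (cells per time-step) to metres per second."""
    return cells_per_step * CELL_LENGTH_M / DT_S


def measurement_steps() -> range:
    """Time-step indices kept for measurement after discarding the warm-up."""
    return range(WARMUP_STEPS, STEPS_PER_RUN)


if __name__ == "__main__":
    print(STEPS_PER_RUN, WARMUP_S, MEASURED_STEPS_PER_RUN)
    print(cells_per_step_to_mps(1))
```

Under these figures the warm-up corresponds to the first 100 s of each run, and one cell per time-step corresponds to 4 m/s.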

Validation of the model for fundamental diagrams
Simulations were carried out at different area occupancies ($\rho_a$) to evaluate the proposed model with and without the position preference rule. Figures 8 and 10 show graphs of flow (q) and stream speed (v), respectively, with and without the position preference rule at different occupancies, with β = 0 and the safe front/back gaps calculated dynamically. Here, PCUs represent passenger car units. In Fig. 8, the fundamental diagram (q-$\rho_a$ plots) shows the expected parabolic shape, with the highest flow near area occupancy values of 0.16 and 0.175 for the cases without and with the position preference rule, respectively. The capacity was found to be higher without the preference rule because vehicles are free to choose any position and thus utilise the road space optimally, leading to the overestimation of flows at a given occupancy level. Thus, the proposed PP-CA model reproduces more realistic capacities than the conventional CA model. Figure 9 shows the fundamental diagram (q-$\rho_a$ plots) of observed and simulated values obtained using the PP-CA model. The simulated values are averaged over area occupancies and simulation runs. The speed-occupancy plot in Fig. 10 shows the common trend, with some noise around an occupancy of 0.1 due to the transition from free flow to the congested state. This is also evident from the q-$\rho_a$ plot in Fig. 8. Figures 11 and 12 show plots of flow (q) and stream speed (v) against area occupancy, respectively, for the car-only scenario for different values of β, with the safe front/back gaps calculated dynamically. It can be seen that as the value of β increases from 0 to 10, the capacity and stream speed on the road decrease. This is because at higher values of β, the

Validation of the model for interaction rate estimation
The aim of this study was to develop a PP-CA model to determine interactions between vehicle types for the given traffic flow conditions. Hence, this model was first validated using visual inspection and the fundamental diagram (q-$\rho_a$) and then by comparing pair-wise simulated and observed interactions between vehicle types. For estimating interactions, whenever a vehicle was found to be following or overtaking another vehicle, it was considered to be interacting with that vehicle. Therefore, a vehicle in the measurement region can have 0, 1, 2 or 3 interactions based on the type of vehicles and the minimum gaps to the sides and front. If one vehicle is followed by another and at Here, vehicles with a gap of more than 60 m were not assumed to be following. Similarly, vehicles with longitudinal overlaps were considered as overtaking or overtaken. Video-graphic data were collected at eight locations, as discussed earlier. Since the objective of this paper was to develop a microscopic model that can simulate interaction rates between different vehicle types, the interaction rates were determined for eight vehicle pairs, as shown in Figs. 15 and 16. The eight locations generated a total of 64 data points. Along with these interaction rates, other traffic characteristics such as traffic composition, area occupancy, vehicle-wise maximum speed and mean lateral position on the road were also measured for calibration of the model. The simulations were carried out at different area occupancies. Table 1 shows the parameters used in the model for simulation. The observed and simulated interaction rates for different vehicle pairs at different locations were compared using paired sample tests. Kleijnen [23] suggested that Student's t-test can be used to verify that the expected values of $x_i$ and $u_i$ are equal.
Then, the t-statistic becomes

$$t_{n_s-1} = \frac{\bar{d} - \delta}{s_d/\sqrt{n_s}}$$

where $\bar{d}$ is the average of the $n_s$ pairs of differences between $x_i$ and $u_i$, $\delta$ is the expected value of $d$, $s_d$ is the estimated standard deviation of $d$, and $n_s$ is the sample size. Since the measured 64 data points were not enough for assuming a Gaussian distribution, Wilcoxon's signed-rank test with continuity correction was used instead of the paired t-test. This resulted in a p-value of 0.19. Table 2 shows the results of Wilcoxon's signed-rank test and Pearson's correlation coefficient between observed and simulated interaction rates, where V is the sum of ranks of the positive differences in the paired data and r is Pearson's product-moment correlation. We can see that the observed and simulated medians are not significantly different.
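The two paired statistics named above can be computed as in the following sketch. It uses plain Python, the function names are hypothetical, and the data in the usage example are made up purely for illustration; they are not the study's observations. The Wilcoxon part computes only the statistic V (zero differences dropped, tied ranks averaged), not the continuity-corrected p-value reported in the text.

```python
import math


def paired_t_statistic(x, u, delta=0.0):
    """Paired t statistic: (mean difference - delta) / (s_d / sqrt(n))."""
    d = [xi - ui for xi, ui in zip(x, u)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    return (d_bar - delta) / (s_d / math.sqrt(n))


def wilcoxon_v(x, u):
    """Sum of the ranks of the positive paired differences."""
    d = [xi - ui for xi, ui in zip(x, u) if xi != ui]  # drop zero differences
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(r for r, di in zip(ranks, d) if di > 0)


if __name__ == "__main__":
    observed = [10, 12, 9, 15, 11]   # illustrative values only
    simulated = [8, 13, 9, 11, 14]   # illustrative values only
    print(paired_t_statistic(observed, simulated))
    print(wilcoxon_v(observed, simulated))
```

A rank-based statistic such as V depends only on the ordering of the absolute differences, which is why it does not require the differences to be normally distributed.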
In Table 1, the values of acceleration, $p_o$, $p_{dec}$, $p_{bl}$, $p_{lc}$ and interaction headway are adopted from a previous study [24], whereas the values of the other parameters were observed by the authors. Then, validation was carried out by checking for positive correlation between the average simulated interaction rate ($I^{s}_{i}$) and the expected value of the observed interaction rate ($I^{o}_{i}$). As suggested by Kleijnen, it is important that the simulated mean increases if the observed mean increases on any road. Hence, Pearson's correlation coefficient was calculated, assuming that in a perfect model the relationship between simulated and observed interaction rates would be linear; it was found to be 0.75.
This indicates a medium to high correlation between the observed and simulated means of interaction rates. In other words, as the observed interaction rate increases for any vehicle pair and traffic condition, the simulated interaction rate also tends to increase.
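For reference, Pearson's product-moment correlation between paired observed and simulated values can be computed as in this minimal sketch. The function name is hypothetical and the usage values are made up to show the two extreme cases; they are not the study's data.

```python
import math


def pearson_r(observed, simulated):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    covariance = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated))
    return covariance / (spread_o * spread_s)


if __name__ == "__main__":
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, increasing
    print(pearson_r([1, 2, 3], [3, 2, 1]))        # perfectly linear, decreasing
```

A positive r only shows that simulated rates rise with observed rates; it does not by itself show that their magnitudes agree, which is what the paired test addresses.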

Vehicular interaction rate simulation
This paper further explores the effect of the area occupancy ($\rho_a$) of the road on the amount of interaction between heavy and light vehicle types. Simulations were carried out at different area occupancies, keeping the vehicular proportion equal for the different vehicle types, and interaction rates were measured for LMVs and other vehicle types. Similarly, interaction rates were also measured between HMVs and Here the interaction rate between two vehicle types was measured as the number of vehicles of type A found to be interacting (following or overtaking) with type B for every 1,000 simulated vehicles of type A found in the measurement area. For example, the interaction rate for LMV-HMV would mean the number of LMVs found following or overtaking HMVs out of 1,000 LMVs observed in the measurement area. From Figs. 15 and 16, one can see that the interaction rates increase rapidly with area occupancy in the beginning and then decrease after a certain point. This is plausible, as at lower occupancies vehicles engage in both car-following and overtaking, while at higher occupancies the number of overtaking instances reduces. The number of car-following instances was not linearly related to area occupancy. It can be assumed that car-following increases in the beginning owing to the increase in area occupancy on the road. But as a particular vehicle can be followed by at most two vehicles at a time under no-lane-discipline conditions, the increase in car-following would not be as significant at higher densities. This is not the case with overtaking manoeuvres, as the possibility of overtaking would cease to exist at very high density. It was found that LMVs had higher interaction rates with LMVs and HMVs because they share the same lane, but had very low interaction with 3Ws, as 3Ws travel farthest from the median lane. It was also found that 3Ws had the lowest interaction rate with heavier vehicles such as HMVs and LMVs.
This could be one of the reasons for the low fatality rate observed among 3Ws. HMV-HMV interaction rates were not analysed, as few HMV-HMV interactions were found in the field data and hence they could not be validated.
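The interaction rate defined in this section reduces to a simple normalisation, sketched below. The function name and the counts in the usage example are hypothetical and serve only to show the arithmetic.

```python
def interaction_rate(n_interacting: int, n_observed: int, per: int = 1000) -> float:
    """Vehicles of type A interacting with type B per `per` observed vehicles of type A."""
    if n_observed <= 0:
        raise ValueError("at least one vehicle of type A must be observed")
    return per * n_interacting / n_observed


if __name__ == "__main__":
    # Hypothetical counts: LMVs seen in the measurement area, and how many of
    # them were following or overtaking each other vehicle type.
    lmv_observed = 300
    lmv_interacting = {"HMV": 45, "LMV": 60, "3W": 6}
    for other_type, count in lmv_interacting.items():
        print(f"LMV-{other_type}", interaction_rate(count, lmv_observed))
```

Normalising by the number of type A vehicles makes the rate directional, so the LMV-HMV rate generally differs from the HMV-LMV rate when the two types are present in unequal numbers.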

Conclusions
This paper attempts to extend the brake light model to heterogeneous driving behaviour by proposing more realistic longitudinal and lateral movement rules. It considers the effect of the position preference of different vehicle types and proposes a modified CA model, the position preference based CA (PP-CA) model. This model also attempts to replicate the driving pattern observed on urban arterials in Ludhiana. The average lateral positions obtained by the proposed model were more consistent with the observed values than those obtained by the Ma-Ra model. The new position preference parameter (β) was found to have a significant effect on flow and stream speed. This paper also demonstrates the use of the interaction rate between vehicle pairs as an alternative method to validate microscopic traffic flow models, which have otherwise been validated through fundamental diagrams or individual vehicle trajectories. The interaction rate between vehicles plays a significant role in determining the crash propensity on a road. Vehicles that share the same lane have relatively higher interactions and hence higher crash propensity than those that are segregated by barriers, medians or dividers.
The results of the simulation showed that there was a significant relationship between area occupancy, vehicular proportion and the interaction rate between vehicle types. The higher interaction rates between vehicle pairs, namely HMV-2W and LMV-2W, may be a cause of the higher number of fatal crashes between these two pairs, as commonly observed in developing countries. This study thus provides fresh impetus to risk analysis modelling using microscopic traffic flow models.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.