Computation of Optimal Spacing and Density of Bus Rapid Transit Stations Using Evolutionary Algorithms

In this research, a feasible mechanism is developed to determine the optimum number of bus rapid transit (BRT) stations as well as their respective locations along the service corridor. To accomplish this, a mathematical model is developed and optimized by using three different evolutionary algorithms, namely particle swarm optimization (PSO), genetic algorithm (GA), and differential evolution (DE), and the results are compared. The total cost function is composed of two main costs namely the operator’s cost, i.e., related to costs on service provider’s end, and the user’s cost, i.e., related to costs on commuters’ end. A functional numerical example with the commuters’ demand is worked out by minimizing the cost function, which demonstrates the applicability of the framework. In our case study, PSO outclassed GA and DE on the basis of convergence rate. Since our work has proved the robustness of PSO as compared to GA and DE, we conducted our sensitivity analysis keeping PSO as our benchmark algorithm to study the influence of various parameters on the optimal cost. The computational experiments reveal that the optimal cost is substantially affected by the variations in the commuters’ demand, commuters’ walking speed, and value of the users’ access and in-vehicle time. On the contrary, the acceleration/deceleration delays at a bus station, bus operating cost, and headway have an inconsiderable impact on the optimal cost.


Introduction
The bus rapid transit (BRT) system also known as Transitway or Metro is being used in different parts of the world as a public transportation system. It is reliable in terms of its capacity, time efficiency, and affordability, as compared to that of the conventional bus system. For several years, the urban rail transit system has remained the only system to efficiently tackle the emerging transport problems in high urban areas; however, the BRT systems have outshined it in terms of construction, operation and maintenance cost of the system, the time required to develop the system, greater flexibility, and the comparatively high throughput capacity [1]. Another study is conducted by developing a model to contrast BRT, light rail, and heavy rail system, and it suggested that the high standard BRT is the most cost-efficient system in terms of users' and operators' cost, while the rest can only be superior if the operating speed kept faster than the BRT [2]. Thus, for the ever-increasing urban population, the BRT system appears to be a quite useful mode of transport in the future's aspect due to its merits over other modes of transport.
A very important and key component in the designing phase of the BRT system is the determination of the density and the locations of the stations, along with the intended route, where the facility has to be provided. The density and the locations of the BRT stations play a crucial part in determining the accessibility of the BRT system, whereas the accessibility factor is very important to be considered, as the efficiency and the Level of Service (LOS) of the BRT system primarily depends on the accessibility [3]. Accessibility for the passengers can be improved by increasing the number of stations; however, it would possibly increase the delays due to the excessive accelerations and decelerations at all stations. These additional delays would eventually affect the total in-vehicle time as well as the operator's cost due to the requirement of additional buses to serve every station [4].
Thus, the objective of this study is to provide with a complete framework that is useful in determining the density and locations of the BRT stations by optimizing the users' and operator's cost. The user's cost composed of users' in-vehicle cost is mainly depending on the time users spend in the vehicle and the value associated with that time. To minimize this cost, there should be a minimum, collective in-vehicle time spent by the users. Similarly, the users' access cost is also a part of the user's cost and it depends on the time users required to reach the BRT system and the value associated with that time. This time should be as minimum as possible in order to minimize the users' access cost. Lastly, the operator's cost fundamentally depends on the fleet size that should involve the minimum optimal number of buses to avoid the soaring in the total cost. Similar work has been carried out by Chien and Qin [3], who developed and optimized the mathematical model using the genetic algorithm (GA). The GA can optimize both continuous and discrete functions as well as multiobjective problems. However, besides advantages, one major disadvantage of GA is that an unfavorable choice of GA parameters might affect the outcome of the application. The choice of vital GA parameters such as crossover and rate of mutation as well as selection criteria of the new population should be carried out carefully. Any inappropriate choice of these parameters will make it difficult for GA to converge or it will simply produce meaningless outcomes [5,6]. Inspired from the work done by Chien and Qin [3], in our paper, we have modified the total cost function and determined the optimal BRT stations' density and locations along the service corridor. The BRT operating in-vehicle cost is modified and the BRT personal and maintenance cost is embedded into it. Addition of a new station on BRT service corridor may considerably affect the personal and administrative cost and influence the cost at operator's end. Similarly, number of stations on the service corridor causes buses to accelerate and decelerate, which has influence on the maintenance of bus. Moreover, we have also used three different evolutionary algorithms (EA) which are particle swarm optimization (PSO), differential evolution (DE), and GA for the optimization of total cost function. In contrast to the previous study, a comprehensive sensitivity analysis is done for in-depth analysis of various independent variables used in the total cost function, also, the framework provides with the single optimal solution against the particular number of stations as compared to the multiple combinations presented in the previous study. Consequently, these advancements result in developing the new framework that is simple, efficient, and scientifically reliable in policy making.
EAs which have been used for optimization problems now became very famous among most of the researchers in the recent past decades. These algorithms have been applied to many engineering and applied sciences fields for finding out the optimized solutions due to their simple framework and applicability. Amidst the various techniques proposed, the three algorithms that are very similar and famous are DE, GA, and PSO. As the genetic algorithm is most conformist and well established among the other due to its earlier introduction [7], relatively recent DE and PSO became widely known due to their ability for finding out the optimized solutions for continuous and discrete problems. GA is a population-based meta-heuristic optimization technique which employs a mechanism that is inspired by the living organisms like the processes of selection, reproduction, and mutation, which works on the principle of "survival of the fittest" [8].
Almost all the EAs share the relatively same processes and operations. There are three main operation and processes in all the EAs, where the first step is the initialization in which the randomly generated initial population of individuals is produced with respect to some specified solution representation, as solution representation is the most critical part in any evolutionary algorithm that in which way it should be represented. It should be represented with respect to the elements of some specific evolutionary method as it is responsible for producing the feasible solution. If the solution representation is not done in a good way, then it may lead to an infeasible solution. Besides solution representation, two more critical parameters must be taken into account before getting into the program (i.e., size of the population and the maximum number of iterations), It should be determined initially, as these two parameters have a great influence on solution's time and quality. After the initialization, the next step is to evaluate each solution present in the population with respect to the fitness values. The purpose of evaluating the solution is to find out the average fitness population or to rank each solution in the population for the sake of selection. At last, the generation of the new population is created using perturbation of solutions in the current population. Figure 1 shows these three processes of the EA. For a novel comprehensive discussion of EA, see [9,10].
To illustrate the EA operations as shown in Fig. 1, the steps for basic EA are listed below: 1. Generating the initial population. 2. Evaluating individual fitness values 3. Check that either all the stopping criteria met or not? If yes, then the program will be stopped and the required solution is accomplished. If it is no, then it will go to the next step.
4. Generating the new population and go up for evaluating the fitness values. This loop will continue to run until all the required stopping criterion is met.
Many researchers have successfully employed EAs in the field of transportation engineering, Saeidi [11] worked on improving the efficiency of the public transportation system by optimizing the routes using the GA, Koh [12] used DE in the field of transportation planning for solving bi-level programming problems. Wu et al. [13] comprehensively reviewed the applications of PSO in the railway engineering domain. It was concluded that the PSO has seen everincreasing applications in the field of railway engineering, whereas network layout planning, scheduling, and active controls are the main research areas. Moreover, many researchers claimed that PSO is a viable method and its computational efficiency is better than GA and DE [14][15][16][17]. Also, there is a cluster of studies that used PSO in solving the BRT route optimization problem and the Transportation Network Design Problems (TNDP) in high-density urban areas [18][19][20][21]. Many studies focused on the bus stations' spacing in urban areas. Saka [22] proposed a model to optimize the bus stations' spacing and concluded that right spacing of the bus stations can cause a reduction in the headway and travel time which would ultimately create an impact in lowering the operator's cost. Moreover, the optimized spacing of the bus stations results in a smaller number of buses in the fleet which reduces the operator's cost. Moura et al. [23] presented a two-stage model to find the optimal locations of the bus stations. At the first stage, the associated cost with the system was minimized by strategically locating the bus stations, while the second stage maximized the operating speed by tactically locating the bus stations. Thus, concluded that the locations of the bus station influence the commercial operating speed of the buses. Ceder et al. [24] designed a model to optimize the bus stop locations in a given route for uneven topography. Some other studies also carried out particularly focusing on urban areas [25,26].
Besides presenting the models and solutions to the station spacing problem, numerous researchers also studied the relationship between stations' spacing and influencing parameters, for instance, headway and speed. Chien et al. [27] specifically studied the relationship between bus headway and the spacing of the bus station. They developed the mathematical model and have taken the travel time of users as an objective function. It is also important to note that a realistic wait time was considered during the study. It was observed that there is an inverse relationship between the fleet size and the optimal headway & stations' spacing, while there is a direct relationship between passenger demand and travel time. Chang and Schonfeld [28] also worked on the optimization of headway in their study and developed the analytic model for optimizing the BRT. Apart from headway, Tirachini [29] conducted a comprehensive study on the relationship between the bus operating speed and bus stations' spacing. He concluded that if the bus operating speed is constant and the passenger's demand is increasing then there should be decreased spacing among the bus stations, while on the other hand if the bus operating speed is increasing then the bus stations should be spaced at large distances despite the rise in demand.
There are several studies where the researchers efficiently used various tools and techniques to undergo the study on optimal stations' spacing problem. Dell'Olio [30] proposed a model that can locate the bus stations and carry out the optimization of bus frequencies by taking into account the cost pertaining to the construction of the bus stations, services' provision, and operation of the bus system. Two levels are defined, the first level is about the systems' cost and the other is about the behavior of the user considering the public transport assignment model. The model reduced the access, travel, and wait times of the users, while on the operator's end, it was noted that the good services were provided with the small increase in the cost of operation. The model was also applied to a case which shows that if the practical situation criteria are to be considered then there is less cost saving as compared to considering the scientific criteria of the applied model. Overall, the model showed satisfactory results when applied to practical problems. Li and Bertini [31] considered two cost components: the first one is passengers access cost and second is the in-vehicle passenger stopping cost using the high-resolution archived stop-level data and used in the model to find the optimal spacing of bus stations. Furth and Rahbee [32] used geographic modeling and dynamic programming approach to find the optimal location and spacing of the bus stations. Furth et al. [33] have done an improved analysis of bus stations' spacing by using the Geographic Information System (GIS).
The notations to be used throughout our paper are enlisted in Table 1. The forthcoming sections are enumerated as follows: The description of the BRT service line is stated in Sect. 2, whereas Sect. 3 explains the model formulation and preparation of total cost function. Then, the concept and application of DE, GA, and PSO are explained in Sect. 4. Then, the computational experiments, i.e., the designed numerical example and sensitivity analysis are presented, and subsequently, the conclusion is summarized in the last section of this paper.

Cost Function Formulation
The development of a cost function is discussed in this section. Few assumptions are made in order to support the formulation process and the framework. Following are enumerated the assumptions that are taken into account while developing the mathematical model and framework:

Assumptions
1. There can be no more than one bus station located between the two consecutive access points, whereas if the minimum cost is coming against the number of stations that are equal to the number of access points, i.e., e = f , then all the BRT stations will be located exactly at the access points. In such a case, the users' access distance would become zero and so the users' access cost. 2. Users will always use the nearest bus station from the concerned access point; thus, the access cost can be determined as the ratio of access distance to the nearest station and the commuter walking speed.
3. Buses are not skipping any station and serving each station; hence, the acceleration and deceleration delays are assumed to be the same at all stations. Moreover, the bus headway is also assumed to be constant and the wait time is assumed to be half of the headway. Thus, the wait time does not affect the optimization and E d , the wait cost is not considered in the total cost function. 4. A fixed BRT service corridor with an elevated signalfree corridor is considered, whereas the demand can only access it from the access points. The boarding and alighting demand are distributed uniformly over a time period, but this demand can be different for each access point. 5. The commuters' demand coming from the access point k in the direction x , is known as the boarding demand in direction x at the access point k and denoted as a xk , while the commuters' demand bounded for the access point k in the direction x , is known as alighting demand in direction x at access point k denoted as d xk . Moreover, commuter demand in both directions is considered to be identical, so a 1k in direction 1 is equal to the d 2k in direction 2. This assumption is taken into consideration so that the optimal density and locations of the stations would remain similar for both service direction, otherwise, it could be neglected but the results for both directions would be different. 6. To estimate the user cost which is a nonlinear function of time, a quadratic function of time is assumed to estimate the user cost. Let's assume that, if the user travel time is t , the user cost would be t 2 multiplied by the value of time and the commuters' boarding and alighting demand.

The Total Cost Function E TC
The total cost function is the sum of all costs, and this is our objective function that is to be minimized to get the optimal density and locations of the BRT stations. The BRT system is associated with the service providers, i.e., the operator, and the service utilizers, i.e., the users, and both are an integral part of any BRT system. The operating cost is incurred by the parameters, for example, consumption of fuel, and the maintenance of the bus fleet. It is worth noting that the density and location of stations affect the operators' cost as the non-optimized number of stations can increase the acceleration and deceleration delays hence can also affect the extra fuel consumption due to additional buses required to fulfill the requirement. On the other hand, stations' spacing and locations, proportionally influence the riding and access time of users. If the number of stations increases, the users' average access time decreases and if the number of stations decreased, the user riding time decreases, whereas there is a cost associated with both vital times, in terms of the values of users' access time and riding time. Therefore, to make a trade-off between the interest of users and operators, we need to find an optimum cost that alleviates the costs associated with both parties. Thus, an optimal number of stations is found out by minimizing the objective function comprising of users' and operators' cost. Numerous researchers in the past studies have also considered the users' and operators' cost as an objective function [34][35][36]. The breakdown of the total cost function is illustrated in Fig. 3.

Operator's Cost E OC
It is the cost related to the service provider and basically the sum of all bus operation costs, i.e., bus operating in-vehicle cost, bus maintenance, and personal costs.

Bus Operating In-Vehicle Cost E O-inc
It depends on the fleet size and the operating cost of the bus. The total bus operating in-vehicle cost is basically the product of the fleet size, (C) and in-vehicle operating cost per vehicle O B , i.e., The bus fleet size can be obtained through the ratio of round-trip time to the bus headway B h . The round time of bus consists of the travel time T s along the service corridor and the layover time t l at each station. To formulate the numerical expression for fleet size, the equation derived by Chien [37] is applied. Thus: where travel time T s is function of travel distance and the bus operating speed v os . Thus, it can be calculated as: where h l and h l−1 represent the distance of station l from the first access point, and the distance of the station, previous to the station l , from the first access point, respectively. Thus, the final expression to calculate the in-vehicle operation cost is represented by Eq. (4).

BRT Personal and Maintenance Cost E O-mpc
The bus personal and maintenance cost E O-mpc constitute maintenance, repair and tire costs of buses as well as drivers and administrative cost. This cost basically depends on the round-trip distance (two times the total length of the service corridor) and the bus fleet size. The dwell time of buses at each station has an effect on personal cost, therefore, total dwell time as well as boarding and alighting demand at each station is also incorporated in bus maintenance and personal cost and can be written as Eq. (5)

User's Cost E UC
The user's cost E UC is the cost that is induced by the transit patrons using the BRT service. It is the sum of the two constituent costs, i.e., user's access and in-vehicle cost, and can be represented as E u-ac and E u-inv , respectively. (4) We use the numerical expressions developed by Chien and Qin [3] to determine the total users' cost as mentioned in Eq. (6).

Users' Access Cost E u-ac
The users' access cost is basically the expense, which is imposed on commuters using the bus transit system. The commuters who have access to the transit stations mainly incur the access cost. It is the cost associated with the time of accessing the bus stations from the access points, hence it can be obtained as the access time of the commuters, i tk , multiplied with the value of access time of commuters, V i . Numerically, it can be estimated by using Eq. (7) As per assumption 2, the commuters will go to the nearest station from the particular access point. So, the user's access time,i tk , can be calculated from Eq. (8): where s k−1 is the distance of the access point k to the first access point. The equation shows that access time is the ratio of distance to the nearest bus station from the access point k and the commuter walking speed.

Users' In-Vehicle Cost E u-inv
Users' in-vehicle cost is incurred by the time users spend in the vehicle, the value associated with that and the total It is the product of the aforementioned parameters. It consists of two costs components: If the extension is to be made in an existing BRT system then the first component, E m 1 is used. It represents the invehicle cost incurred by through flow in direction x,M x and it can be determined by using Eq. (10). Moreover, in this scenario, M x is to be determined beforehand, while for a new BRT system, its value is zero and the cost component would consist of:E m 2 only. E m 2 represents the cost incurred by the commuters that are boarding and alighting along the new route.
The total acceleration/deceleration delay, T a can be determined using Eq. (11) In Eq. (11), e represents the number of stations, and acceleration/deceleration time at a single station,w t can be calculated by using Eq. (12) derived by Chien [38]. Therefore Subsequently, for determining the second component of the total users' in-vehicle cost E m 2 , the service line is divide into three segments in terms of the service corridors, each segment is contributing its cost that is resulting in making the respective costs E m 21 , E m 22 and E m 23 as represented in Eq. (16). The entire service line with f access points can be divided into the total number of stations plus one, i.e., e + 1 service corridors. Thus, E m 21 is the round-trip users' in-vehicle cost occurring on the first segment that is composed of service corridor 1, where service corridor 1 is for serving the demand from the first access point to the first station.
E m 21 can be determined by using Eq. (17): where E m 21 is the product of the value of in-vehicle time, V m the weighted round-trip travel time on the service corridor z,2t 2 1 , and the updated or increased hourly demand that is M x + D 1 . The travel time on the service corridor z and the increased hourly demand can be determined by using Eqs. (18) and (19).
where z 1 is the distance of the first bus station to the first access point and it is equal to where the through demand for the service corridor z , D z , can be obtained by Eq. (21): where the total travel time on the service corridor z is represented by Eq. (22) where z l represents the length of the service corridor z . It is measured from the station z − 1 to station z . For all the service corridors falling under the segment two, acceleration and deceleration delays are considered as shown in Eq. (22). The third and the last segment is composed of only one service corridor that is the last: ( e + 1 ). Now, the round-trip user in-vehicle cost occurring on the third segment of the BRT service line E m 23 can be determined by Eq. (23): where D e+1 is the increased demand on the service corridor e + 1 and can be obtained by using Eq. (24): and the total travel time on the service corridor e + 1 can be calculated by Eq. (25): where S − ∑ e z=1 z l is the length of the service corridor e + 1 . For the last station (e) , the deceleration delay is not considered. Now, substituting Eqs. (17), (20) and (23) in Eq. (16) provides with the total users' in-vehicle cost as Eq. (26): As we discussed earlier, the total cost consists of users' cost and operator's cost, whereas the wait cost in the total users' cost is constant. Therefore, the total cost function can be estimated by summing up E o-inc , E o-mpc , E u-ac and E u-inv by using Eqs. (4), (5), (7) and (26), respectively, to determine the optimal BRT, bus rapid transit, stations' density and locations.
Equation (27) represents the total cost function that is the objective function in this study.

Application of Evolutionary Algorithms
The three evolutionary algorithms (i.e., DE, GA, and PSO) are applied to optimize our objective function (total cost function) to compute the optimal locations of BRT stations' and their density. The subsections of evolutionary algorithms comprise of explanation and implementations of DE, GA, and PSO. Each algorithms' implementations are discussed individually with the help of flowchart for better understanding.

Optimization by Differential Evolution
The differential evolution (DE) is a well-exploited heuristic technique that is a population-based stochastic search method for solving optimization problems over continuous space like the genetic algorithm. It was first proposed by Storn [39] for global optimization. Its framework and applicability are quite simple and easy, also it requires a smaller number of control variables which can lead someone to think that its convergence property might not be good, but it performs well in convergence. Figure 4 shows the flowchart of the DE algorithm. Differential evolution is an efficient yet powerful technique which became widely famous due to its new form of solution and used in many engineering fields as well as in many other scientific fields. A vector population-based stochastic algorithm is like GA as it also works on the three operators name as selection, crossover, and mutation, but it uses the randomly sample pairs object vectors in the process of mutation which makes it relatively recent and new algorithm when comparing other evolutionary techniques, for example, GA, it produces the randomly generated initial population, then three random individuals are selected for each individual and a new vector is produced by adding the weighed difference of two individuals. As in GA, crossover operator was one of the main operators but it is complementary in the DE algorithm. When this whole process is accomplished using all the individual, then fitness is evaluated. As a result, if the new fitness value is better than that of old fitness value, then it simply replaces the old individual with a new one. This whole process is repeated until the convergence is reached or the maximum number of generations is accomplished. DE produces a new solution by merging various solutions with the candidate or standard solution.
To illustrate the DE operations as shown in Fig. 4, the step for basic DE are listed below:  Step 1 Initialization DE algorithm generates in D dimensional search space a population of NP called individuals. In our case study, individuals are spacing of stations from the first access point to the station. In generation G population of individuals can be written as: The entire search space is covered by the initial population. The initial value of jth parameter when G = 0 is: w h e r e i = 1, 2, … , NP and j = 1, 2, … , D . The values generated randomly are uniformly distributed with the range (0, 1); x UL j , and x LL j are the upper and lower limits of jth parameter.
Step 2 Mutation For each target vector X i,G , mutation operation is performed by DE to generate a mutant vector The DE strategy used in our work is: DE∕best∕1∕bin which is best defined as: and is a constant and a real factor. For Gth generation, X best,G is the best fitness vector.
Step 3 Crossover Operation Trial vector: U i,G = u 1,G , u 2,G , … , u Di,G is generated in this operation from mutant vector V i,G and target vector X i,G . The crossover strategy is expressed as: Step 4 Selection In this operation, at each target vector f X i,G and trial vector f U i,G , objective function values are compared for smaller fitness function values. These values remain in the next generation, while others are disregarded. The strategy for selection is as follows: Step 5 Termination Mutation, crossover and selection are repeated generation by generation until the stopping criteria are reached. For the optimization problem's solution, the best vector X best,G of the current population is returned as the answer.

Optimization by Genetic Algorithm
The genetic algorithm (GA) is a well-exploited heuristic technique that mimics the natural or genetic selection. It was proposed by J. Holland's and became widely known in the coming years due to its ability for finding out the global optimal solutions effectively without any prior information.
It is an evolutionary heuristic technique which is typically population-based search method. This global optimal technique works on the principle of "survival of the fittest" and motivated by Darwin's theories of evolution [40]. It represents the solution set as chromosomes (population). Figure 5 shows the flowchart of GA. A genetic algorithm technique is an iterative process which is based on the schema theorem [41]. The solutions here represent the chromosomes which are first evaluated for the best fitness value, then the very next task is that they are ranked from the best to worst with respect to that fitness values. In GA, it mimics the natural selection to produce new solutions, this whole process can be carried out by employing the three genetic operators (i.e., selection of better chromosomes, crossover to produce new offspring, and the mutation to inject or employ the diversity), and the continuous application of these three operators can give an optimal solution. First of all, what does the selection of chromosomes mean? It means to pick better chromosomes and make them parents to generate or produce the new chromosomes (new offspring) with better fitness value along with higher probabilities rather than poorer fitness value along with lower probabilities. Fitness values are typically used as a benchmark to decide either the individuals or chromosomes are elitist or not, based on the method proposed. As it is described above that this algorithm works on the principle of "survivor of the fittest," so, the chromosomes with better fitness are selected. Now, what does crossover means in GA, once a selection phase is done, the operator of the crossover amalgamates that chromosomes to generate new offspring. As the fittest individual or chromosomes are picked often, so there is a probability that the new solution that we have achieved become nearly identical after various generations, so the diversity may drop or fall and this process can lead to a phenomenon known as population stagnation, so, as to avoid the stagnation in population, we use the mechanism name as mutation, which adds diversity to reduce the effect of population stagnation.
As the size of the population and the number of iterations are the most important factors in this algorithm, similarly, there is a need to observe a different set of decisions that must be adopted beforehand. The first set can be the selection method and the mechanism for probability assignment, which should be based on fitness. Various selection methods are used in literature, but the three very popular methods are Roulette Wheel Selection, rank selection, and elitism selection [42] Likewise, the second set can be crossover method and crossover probability, many crossover methods are used in literature, but the three very popular methods are single-point crossover, two-point crossover, and uniform crossover. The mutation method and mutation probability is the third set of decision, as it can help in balancing the diversity by adding the new chromosomes in the population.
To illustrate the GA operations as shown in Fig. 5, the steps for basic GA are listed below: Step Step 2 Initialization Step 3 Fitness Value In general, objective function has a relationship to fitness function to a certain degree. To find fitness of each solution of P(T) , we evaluate (P(T)).

Step 4 Selection Process and Crossover Operator
The parent solution set P(T) is selected to form n number of offsprings by one-point crossover operator, and the enlarged population space set P � (T) is now composed of (N + n) solutions. The roulette_wheel P � (T) operator is used to generate the next generation solution set P 1 (T) . Chromosome's selection probability of being proportional to its fitness value compared to the other chromosomes in the population is by means of this method.
Step 5 Reproduction The best individuals are selected and copied to the next generation from the current generation. This strategy is known as elitist, which means that the population is led by individuals with the best fitness value, hence, they are termed as elite individuals. By reproduction, the best solution improves from one generation to the next monotonically. Step 6 Mutation The mutation operator is used to prevent premature convergence of the population. Instead of mutating gene-to-gene, new individuals are introduced into P 1 (T) . Generation of the new mutants is random like the generation of the original population and therefore current population's genetic material is not brought in.
Step 7 Stopping Criterion This evolutionary process, when the best solution is found or when the total iterations have been run, is halted. In our case, the stopping criterion is when total iterations have been completed.

Optimization by PSO Algorithm
Particle swarm optimization (PSO) is an intelligent optimization algorithm which is based on the paradigm of swarm intelligence developed by Kennedy and Eberhart [43]. PSO is inspired by the social behavior of animals like fish and birds, in which the individuals are termed as particles and population is termed as a swarm. In a defined space, each particle is assigned a position and a velocity. All the particles then search individually for food. When a favorable location of the food is discovered by a particle, the information is then relayed to all other particles, in that space, who then rush toward that particle immediately. The particles wander around in the given space, conveying favorable locations to others, and alter their velocities and positions based on these favorable locations [44]. PSO belongs to a class of algorithms called meta-heuristics. It is similar to other evolutionary algorithms, like genetic algorithm, in a way that it uses a population bases approach to retain and refine numerous possible solutions, frequently using population attributes in conducting the exploration. Unlike GA, the evolutionary procedure in PSO only develops locations of the particles, rather than generating new particles.
The total cost function formulated in Eq. (27) is to be minimized, to find the optimal number of stations and their respective locations. The optimization algorithm is to be run for the different numbers of stations, starting from e = 1 to e = f , separately and, the particular minimized cost for each number of stations is to be found. The decision variable h l used in the total cost function E TC will determine the location of the bus stations. Figure 6 shows the flowchart of the PSO algorithm.
PSO is initialized with a variety of possible solutions as a swarm of particles. Each particle explores the new possible solutions repetitively by its motion in defined space with an altered velocity according to both the preceding finest solution of itself and the swarm. The finest solution observed by the current particle is termed as local best, while the finest solution explored by the swarm is termed as global best. All the particles are connected through global best, that is, each particle is guided by the finest solution of the whole swarm; the local best is used to consider the capacity of each particle to recall its previous individual successes [45].
Consider the dimension of the defined space as a positive integer D . The location of ith particle is denoted by , where x t ij is the location of ith particle in the jth dimensional value at the tth iteration. The rate at which the location change (velocity) is denoted by v t ij j=1,…,D , where v t ij being velocity in jth dimension of ith particle at the tth iteration. The fitness function which computes the quality for the location of the particle i . Every particle recollects its own finest location achieved so far (the location corresponding to the finest function) as P t best,i , where P t best,i is the finest location of ith at the tth . The finest location observed so far by the population is denoted as G best . During the tth iteration, velocity for jth dimension of every particle i is updated by [44] where w is the inertia weight, c 1 and c 2 are constant values multiplied by randomly generated numbers r 1 and r 2 , respectively. These random numbers are used to retain heterogeneity of the swarm and are allocated consistently in the interval [0, 1] [45]. Based on its current direction v t ij , its personal best memory P t best,i and its inclination to like the finest particle in the swarm, each particle moves. The ith particle location is updated by the addition of its preceding location and the new velocity as; Every particle i hunts in the defined space for the solution x t ij . We define this search space by defining the limits: upper limit (UL) and lower limits (LL), in which we want the desired outcome in the form of the optimum solution. In the scenario of this research, S , the length of the study segment will be the defined search space such as 0 be the LL and S be the UL. The particles will converge to the optimum solutions within this defined search space and the optimum results will be coming in the form of the optimum location of the stations along the length of the study segment. To lead the particles productively, the moving distance, in the defined space with a range [UL, LL] in any iteration, must be clamped in between the maximum range [−v max , v max ] . If we do not clamp the velocity, it would result in the rapid  Step A Derivation of Total Cost Function To begin, take e = 1 and derive the total cost function using Eq. (27). The function will have only one decision variable h l that will determine the location of a single bus station. The total number of stations e cannot exceed the total number of access points f . Therefore, the equations for all the stations: e ≤ f will be derived.
Step B Minimizing the Total Cost by PSO Minimize the total cost E w,e for the e number of stations using the PSO approach. The optimized cost function will give the minimum cost, as well as the optimal location of bus stations. The PSO algorithm is coded into the MAT-LAB programming environment for minimizing the function using the distance of BRT station from the first access point as a decision variable. The following steps show the implementation of PSO in our work.
Step 1 Initialization Define parameters such as decision variable x t ij , velocity v t ij , upper limit ( UL ) and lower limit ( LL ) of the decision variables, no of maximum iterations N , max no of particles P , define max velocity v max = 0.2 * [UL − LL] and min velocity v min = −v max , the set value of global best G best = inf and iteration counter t = 0.
For i = 1 to n : randomly generate the position of the decision variable x 0 ij in [UL, LL] with a population size of P particles.
Step 2 Updating Position and Velocity of Each Particle For i = 1 to n : generate randomly r t 1,j in [0, 1] and r t 2,j in [0, 1] and update v t+1 Step 3 Calculating Position of Each Particle For i = 1 to n : run the algorithm on the total cost function f t ij using x t ij and other given inputs in Step 1 and Step 2. Select min cost value from f t ij and its corresponding x t ij from the results generated.
Step 4 Updating of Local Bests and Global Bests For i = 1 to n : using the results from Step 3, if f t ij ≤ f best,i then update f best,i ← f t ij and P t best,i ← x t ij ; if f t ij ≤ f gbest then update f gbest ← f t ij and G best ← x t ij .
Step 5 Ending Condition Set value of t = t + 1 . If the ending condition is not met then go to Step 2. If not then G best is the best possible position of the particles with the total cost function f gbest . Extract the necessary information and stop.
Step C Finding the Optimal Solution Once, the minimized total costs for e = 1 to e = f are determined, search for the number of stations with the minimum total cost. The values of the decision variables resulting from the optimized cost function of that particular number of stations will determine the optimal locations of stations.

Computational Experiments
To practically demonstrate the applicability of the developed model, a numerical example is designed, and the supposed scenario imitates the real-life conditions. For demonstration purpose, the values of different variables are assumed.

Numerical Example
Let assume, an extension is to be made on the existing BRT route, to provide the facility to the residents of a newly developed suburb. A different number of stations can be situated along the BRT track, whereas the commuters can access the facility from five different access points. Figure 7 illustrates the distances between consecutive access points as 1.7, 2, 1.5, and 1.8 miles, respectively. Now, taking into account the key parameters, i.e., the users' in-vehicle time, users' access time, and the operator's interest, various numbers of stations can be added to the BRT system as discussed. The boarding and alighting demands at the five access points are enumerated in Table 2. The optimization algorithms are used and the optimal results are worked out.
After applying PSO, GA, and DE on our total cost function for stations one to four in MATLAB programming environment, the results of the number of iterations versus best costs, as shown in Fig. 8, are plotted. The graphs show that among these three algorithms PSO clusters rapidly and also its swarm becomes stagnant. In the case of GA, clustering is not very obvious, but the reinitialization process shows randomness being injected to improve the diversity toward an optimal solution. While in the case of DE, it can be seen that clustering is least and reinitialization has the least effect in DE. The robustness of PSO as it converges to the best optimal solution way before DE and GA can be clearly seen in the figure.
Since the scope of our work revolves around PSO and in proving its efficiency over GA, which we have set as our benchmark, therefore, in the upcoming sections, the work done is mainly through the use of PSO. Figure 9 illustrates the optimal distance and cost against the number of BRT stations. The highest achieved optimal cost is 43,085 USD/h against 1 station with the optimal location at 3.60 miles at the 7-mile-long track. On the other hand, the lowest achieved optimal cost is. 2536.3 USD/h against the four stations, with the optimal locations at 0.07, 1.76, 4.44, and 6.90 miles. Furthermore, the cost against 2 stations is 20,621 USD/h and the optimal locations are 0.91 and 5.49 miles. For 3 stations, the optimal cost is 4550.8 USD/h and the optimal locations are 0.97, 4.43, and 6.90 miles. The locations for the 5 stations would be the same as the locations of the access points due to the assumption (1), whereas the optimal cost is 14,295 USD/h. Consequently, the aforementioned density and locations of the BRT stations against the total optimal cost of 2536.3 USD/h would be considered as the optimal solution of the numerical example under consideration.
Moreover, the optimal density and locations of the stations, lead to the average access distance of 0.31 miles per commuter and, the average access time of 7.44 min, considering the commuter access speed at 2.5 mph. Additionally, the dwell time, T b and, the increased travel time due to the acceleration and deceleration of the Bus, Ta , results at 0.14 h and, 0.16 h, respectively. Consequently, the increased travel time of the Bus, in the result of serving commuter at each station is 0.6 h. Assuming the headway at 0.2 h, Eq. (15) suggests that two additional buses are required to serve the demand at the new stations of the BRT system. The optimal locations of the stations along the BRT track are shown in Fig. 10.

Sensitivity Analysis
Sensitivity analysis is the process that determines how the variations in the independent variables used in the objective function can impact the dependent variable that is the optimal cost in this study. A comprehensive sensitivity analysis is carried out in order to observe the effects of various variables used in the objective function, as derived in Eq. (27), on the optimal cost. The optimal costs against  the multiple values of different parameters are checked and the results are elaborated in this section. Figure 11 shows the optimal cost for the different numbers of stations against multiple values of users' in-vehicle time, V m . It can be seen that the optimal cost increases with the increase in V m and vice versa. Furthermore, it is important to note that for all the values of the V m , optimal cost remained minimum at the 4 stations which shows that stations. At the start with 1 station, the optimal cost against different values of V i varying a lot but as the number of stations increases, for example, station 4 and 5, there is very less difference among the optimal costs. Figure 13a, b illustrates that the optimal cost slightly decreases with the increase in the bus headway,B h , and average vehicle operating speed, v os . On the other hand, change in these parameters do not affect the optimal number of stations as the 4 stations, acquire all the minimum costs against different values.
It can be observed from Fig. 14a-c, that variation in the bus operating cost, E OC , through flow in the direction x , M x , and the acceleration and deceleration delay at a station have a negligible impact on the optimal cost. Therefore, the bus operating cost that is comprised of many factors, for example, fuel consumption, bus maintenance cost, and the wages of bus operators do not impact the optimal density or locations of the BRT stations. Moreover, the flow of commuters already bounded to the news stations from the existing stations, and the acceleration/deceleration delay also play no important role in determining the optimal density and locations of new stations. Figure 15 of the demand multipliers versus the optimal cost shows about the most striking relation of sensitivity analysis. It is worth noting that the variations in the demand can dramatically change the optimal costs. Unlike the other variables, the demand is the only variable that impacts the objective function most as both of the cost: user and operator, are directly influenced by the demand. Therefore, an increase in demand will increase the optimal cost. However, it does not affect the optimal quantity of stations as even with the variations, 4 number of stations have the minimum costs against different demands.
The walking speed of commuters also influences the optimal cost as shown in Fig. 16. Increment in the walking speed reduced the optimal cost which means that if the commuters mostly consist the young people whom walking speed is more would lead to low optimal cost as compared to the commuters consists of elderly or children. However, this variation converges with the increase in the number of stations, for example, the variation in optimal cost against the different walking speed at 4 stations is less than the variation at 3 stations and so on.

Conclusion and Recommendations
This paper has studied the problem of optimal locations and densities of BRT stations, by using PSO, GA, and DE algorithms and presented a complete framework in order to determine them. First, a mathematical model is developed by considering the realistic demand patterns. The model composed of the operator's cost and the users' cost, i.e., in-vehicle cost, access cost, while the wait cost  is considered as constant. Secondly, the model is applied to a designed numerical example with five access points and the total objective function is minimized. Thirdly, as our results show that the PSO has outclassed the GA and DE, therefore, the sensitivity analysis is carried out using only PSO algorithm to analyze the behavior of optimal cost with the variations in different independent variables used in the objective function.
The computational experiments reveal that: 1. The model can be applied to any scenario consisting of an elevated and a signal-free corridor with multiple access points, varying distances among the access points and for any boarding and alighting demand distribution. 2. The optimal density of the bus rapid transit stations along the new track or the extension in the existing track can be determined. 3. In addition to the optimal density, the optimal locations of stations minimizing the users' in-vehicle, access time, and the fleet size for catering the demand can be found out.
4. The optimized cost can be varied the most by the fluctuation in the demand, more the demand more is the optimal cost and vice versa. 5. Value of users' in-vehicle time and access time directly influences the optimal cost, while the commuters' walking speed inversely impacts the optimal cost. Moreover, variations in the bus operating speed and its headway slightly affects the optimal cost. On the other hand, bus operating cost, through flow in direction d showed no influence on the optimal cost. 6. Variation in the value of any single value can affect the optimal cost but it can never alter the density of the stations. 7. Our results showed the robustness of PSO in terms of convergence rate as compared to GA and DE. Hence, our work proved that by using PSO we can reach the optimal solution with the same population size in a fewer number of generations as compared to the other two evolutionary algorithms.