A Comprehensive Analysis of Passenger Alighting and Boarding Rates

The understanding and management of station stops continues to be a key issue in the operation of urban railways. This paper reports a statistical meta-analysis of passenger alighting and boarding rates from an expansion of a real-life worldwide data set which includes 34 different variables reflecting characteristics of passenger flow, rolling stock design, infrastructure and management actions. This has enabled the authors to identify, test hypotheses about, and quantify the impact of, previously-untested variables. A stepwise regression method using the R statistical package was proposed and developed into a more tractable model with fewer variables. This process eliminated those variables shown to provide no statistical explanation (including the presence of platform edge doors). Of the remaining 18 hypothesised variables, all provided some form of statistical explanation at the 90% level (or more) in one model or another. The results will help railways and transport authorities around the world manage station stops, through timetabling and appropriate investment.


Introduction
In urban areas, which are one of the rail mode's key passenger markets, line capacity is determined not so much by signalling arrangements as by what happens at stations. That time comprises the function/technical time needed for operation, but also the time needed for passenger flow (alighting and boarding), which is the focus of this paper. When railways are busy, line headways can easily be exceeded in track sections which include platforms. Not only is it common for where t p = passenger movement time, t f = fixed function time (e.g. door opening and closing), t v = variable function time, RORI = run-out/run-in time (the minimum platform reoccupation time permitted by the signalling), and H = planned service headway, but the variability and potential for delay associated with t p increases the likelihood of delays in corridors in which high levels of capacity utilisation may prevent easy recovery. This is clearly a key area of concern for many urban railways, and one in which much effort has already been made, including by the partners reporting this research.
The Railway Consultancy Ltd in UK is a specialist railway planning company who advise on both commercial and operational rail matters. This has led to a number of research papers on both sub-threshold delays (including those arising during station stops) and passenger movement rates, from work undertaken particularly in the UK, Norway and Germany.
The Transport Strategy Centre at Imperial College London is an applied research centre focussed on public transport economics, operations and management. It undertakes, amongst other projects, cross-departmental benchmarking between c. 90 companies who operate metro, light rail, suburban rail and national rail services around the world. As part of the operational analysis, however, that includes station stop surveys carried out in a similar style to those originally developed by London Underground (e.g. see [1], a paper which also sets out the context of these surveys, with some early results of different passenger rates). Those have formed the basis for a number of practical research papers establishing best practice for urban railways, including Canavan [2]. This paper is structured as follows: Sect. 2 summarises previous literature from the various main avenues of research, whilst Sect. 3 describes the manner in which data has been assembled for this analysis. Section 4 sets out our analytical method, with Sect. 5 providing the key results. Sect. 6 summarizes the conclusion with a discussion of these results.

Literature Review
Since the early work of Weston [3], considerable research effort has gone into this important subject. Researchers have generally followed one of four paths, as follows: (i) those using laboratory-based experiments with volunteers; (ii) those using data from real-world platform situations; (iii) those using higher-level statistical analysis from railway performance data; and (iv) those using simulations.
An interesting omission from this list-so far-is the use of 'big data' automatically collected from passenger counting systems and video monitoring on an ongoing basis.
We do not cover the simulation literature in this paper, not least because it is based on the parameters which the other approaches seek to calculate directly, in a similar manner to which Alvarez et al. [4] have applied Weston's early formula to an entirely new context in Panama.
Several issues, however, are key to understanding the results from all methods. First, it is important to realise that, because movements can occur across many doorways simultaneously, it is only the busiest ('critical') door that is the determinant of t p . There are many practical ways of reducing this (see [5]), and it is the aim of this research to update evidence indicating the relative effectiveness of these. The two areas excluded from the current analysis are the management of the total number of potential passengers per train through different timetable design (frequency, stopping pattern, fares policy, etc.) and the distribution of passengers between doors (recently addressed by [6]).

Laboratory Research
In order to control elements of experimental design as much as possible, some researchers have tried to set up laboratory conditions to mimic the platform-train interface. Daamen et al. [7] reported results on the platformtrain gap and carriage of luggage, from the Stevin laboratory at Delft University.
The main stream of laboratory-based research has, however, been based around University College London (UCL)'s Pedestrian Accessibility and Movement Environment Laboratory (PAMELA) physical train mock-up and subsequent work set up at the University of Los Andes, in Santiago de Chile. Only the briefest summary of that work is included in this section, because the contribution of individual papers is introduced where relevant, in the variable-by-variable specification of possible parameters, which follows in Sect. 3.
Fernandez et al. [8] demonstrated increasing flow rates with increasing passenger density (which, given their definition, broadly equates to the relationships we show with increasing passenger numbers). They also found that flow rates are maximised with a small, as opposed to minimal, (vertical/horizontal) platform-train gap, although this goes against what one might expect from first principles of the understanding of human physiology [9].

Research on Operational Railways
Important contributions to understanding passenger movements on live railways have come from Heinz [9] in Sweden, Wiggenraad [10] and Daamen et al. [7] in the Netherlands, Lam et al. [11] in Hong Kong, and our own work around the world. The Transport Strategy Centre (TSC) has extensive experience in metro railways, especially the larger ones, whilst Railway Consultancy Ltd (RCL)'s work has generally complemented this with work for suburban and mainline railways. Some authors (e.g. Watts et al. 2018) also quote live measurements as part of their calibration for their laboratory research.

Statistical Analysis
Given what train operators know about the train and carriage loadings at/from previous stations along the route, and the relative busyness of stations and times of day, it is possible to provide some guidance about the expected durations of t p from statistical analysis based on past experience. For instance, Buchmüller et al. [12,Figure 5] clearly show the impact of passenger flow rates increasing with passenger quantities, as part of a wider quantitative assessment of station stop time data derived from automatic train-borne equipment.
Whilst not providing behavioural explanation, 'passengerdisregarded' aggregate statistical methods such as that by Li et al. [13] can provide reasonable estimates, which can be particularly useful if station-specific conditions make 'passenger-regarded' models difficult to estimate. Disregarded models (e.g. that of d'Acierno et al. [14] in Naples) can also enable the impact of service bunching to be demonstrated if headways (and hence per-train passenger loads) vary from those planned, which can be helpful. However, our aim was to enable railway operators and authorities to be able to estimate the impact of changes in location-specific characteristics, for which the disregarded models are inadequate.

Synthesis
A key issue particularly with the stream of laboratory research is of trying to examine variables in isolation. Their apparent importance could be misunderstood if correlations with other variables were not also taken into account. To an extent, these problems also occur with research on operational railways since, for instance, some train design features (e.g. longitudinal seating and large vestibules) are usually combined. For real value for those producing rolling stock and operating railways, it would therefore be helpful to be able to provide some form of more comprehensive approach which could put the importance of different factors into perspective.
A meta-analysis of data compiled jointly by RCL and the Railway and Transport Strategy Centre (RTSC) [15] enabled the first predictive model to be constructed, following a regression of 17 different variables.
Analysis based on this model has subsequently been used by RCL to advise clients, both in terms of rolling stock design and also to predict station stop times at specific locations. Unfortunately, the 2014 analysis had a number of weaknesses. For instance, insufficient constraints were used to limit some parameters to known directionality or size.
This current research is therefore designed to: (i) remove some analytical weaknesses with the 2014 model; (ii) incorporate some new independent variables which have been hypothesised to impact on passenger movement times; and (iii) recalculate values based on a larger sample (of 225, not 125, surveys).

Data Specification
In order to ensure the greatest likely level of explanation, this section summarises previous research on each of the independent variables considered. Various authors (e.g. [14,16]) have noted that the potential independent variables here fall into several key categories. These relate to fixed infrastructure, rolling stock, passenger flow and management, are summarised in Table 1 below, and form the basis of the descriptive structure which we have adopted here.

Quantity
Following the work of Wiggenraad [10], it is important to note the required measurement technique for this: the passenger flows included in this analysis only relate to clustered passengers (those forming a continuous flow), with 'late runners' being excluded. From first principles, the quantity of passengers would be expected to be a determinant of passenger movement times and rates. However, it is often assumed (e.g. [14,11,Figure 5]) that there is a linear relationship between the number of passengers and the time taken, but results from RCL and others (e.g. [17,18]) suggest otherwise (see Fig. 1, with time tapering off at the highest number of passenger movements). Luangloriboon et al. [19] found that passenger boarding rates increased with in-car passenger density, but only up to 3 passengers/m 2 , and not thereafter. Heinz [9, p 120] suggests that the linear relationship only applies at door widths greater than 1.8 m, a typical maximum width of a train doorway.
Christoforou et al. [20] calibrated a regression model on light rail traffic in Nantes (France) using the sum of boarding and alighting passengers together, but there are reasons to believe (e.g. the relative amount of space on platforms, compared to within trains) that the rates of movement of the two groups are not likely to be the same.
What d'Acierno et al. [14] did correctly identify, however, was that forecasts of movement rates and dwell times need to be time/train-specific. In the urban area they studied (Napoli, Italy), train capacity may be reached during peak periods, leading to passengers being left behind and therefore increasing the number of passengers requiring the following train. In suburban and mainline railways, a common problem is variations in stopping patterns and destinations, which lead to variable loads between successive trains (as well as passengers on platforms awaiting later trains but blocking the movement of passengers in and out of the current one).
However, even for a given quantity of people, their distribution along the platform and between different train carriages is critical. This was recognised in Weston's original [3] research, in which he used a general peakiness factor to relate the number of passenger movements at the busiest door to those at the average door. At specific locations, the exact values can be calculated, as Fox et al. [21] did for Oxford, Britain.

Passenger Interactions
In line with the seminal contribution of Fruin [22], one would expect from first principles that crowding hinders pedestrian flow. Indeed, crowding in the vestibules of public transport vehicles has long been recognised as following the traditional v 2 relationships familiar throughout transport planning. For instance, Weidmann [23, p 173] provided some values of the impact of congestion caused by passengers standing on buses. More recently, several authors have specifically analysed this issue. Fernandez [24] provided estimates of decreases in passenger alighting and boarding rates with increasing crowding levels on buses, whilst Palmqvist et al. [25] identified that, in Tokyo, increases in congestion levels of 10% added about 1 s to passenger movement times (and hence overall dwell times).
Weston's early work [3] also included an interaction term between the number of boarding and alighting passengers, to reflect the observed pedestrian conflicts. However, Harris [26] noted that this approach over-estimated station stop times at very busy locations (e.g. more than 20 boarders and alighters at the same door), hypothesising that  Source: [17] the large space vacated by the 20 alighters more easily enabled significant numbers of boarders to do so quickly. Lee et al. [27], working in Seoul (Korea), identified four different types of situations in which conflicts between boarding, alighting and standing passengers occurred, with different proportions of each resulting from the relevant train door from different platform entry and exit points. Overall passenger movement rates fell from c. 1 passenger/ s as the number of remaining passengers fell. This is supported by the hypothesis of social pressure, i.e. that where there is room to do so, passenger movement is quicker when individuals feel pressured to hurry by those behind them.
Seriani et al. [28] investigated the impact of the boarding-to-alighting ratio, since one might hypothesise that equal numbers of boarders and alighters provide the maximum conflicting interaction, which a predominantly unidirectional movement would avoid. However, their results were not clear-cut, perhaps reflecting the possibility that a dominant movement of alighters can switch to a dominant movement of boarders at any stage during the station stop.

Passenger Characteristics
Wiggenraad [10] disaggregated his results by type of passenger (peak/off-peak), but actually found almost no difference in passenger movement rates; however, passengers on intercity trains were demonstrably (10-15%) slower than on local trains, perhaps reflecting variables such as lesser journey familiarity and more luggage, as well as any rolling stock-related factors. In their laboratory-based experiments, Daamen et al. [7] reported reductions in passenger flow rates of around 25% for participants with luggage. Following a programme of work for several British train operators, Harris and Ehizele [29] isolated the impacts of large luggage on passenger movement rates. Reductions in passenger flow rates ranged between 10 and 83%, depending upon situation-specific circumstances such as the size of the platform-train gap (and probably also the stowage arrangements of luggage within the train, although it was not possible to test that). The higher values appear to be realistic where free flow throughout the alighting or boarding process would otherwise have been possible.

Doors
Railway managers have understood for some time that increasing the proportion of train length comprised of train doors enables a greater throughput of passengers. This is behind the recent decision in London for Crossrail trains to have three, rather than the British-conventional two, doors per car and the longer-established development of trains with up to five doors per car in the Asian conditions of highest demand in Shanghai and Hong Kong. This can be the single most important element of design to improve passenger flow rates, and the arithmetic is very simple: 50 passengers alighting at 1 passenger/s through two doors take 25 s, but through five doors take only 10 s. However, it is less clear whether movement rates per se are also affected.
A related variable is that of car area/door width, as the total vehicle capacity is a function of car volume not just car length. This attempts to make some adjustment for car width which (in our sample) ranges between 2.34 and 3.6 m.
The distance between doors was previously studied but failed to demonstrate a relationship for boarding, even though it seems plausible that marking platforms to inform passengers about the expected position of train doors would be helpful. However, distance between doors was found to be statistically significant for alighters, presumably reflecting the longer time taken to access doors from more distant locations within the train, and therefore the difficulty in sustaining optimum flow rates throughout this process. This is also relevant to the discussion on aisle widths (see below).
From first principles, it might be thought that increasing the door width of trains improved passenger movement rates. However, this is not a linear relationship. Heinz [9] postulated that rates were a function of 1/W 2 , whilst Harris et al. [30], based on practical work in Norway and elsewhere, demonstrated that the relationship was not monotonic. Instead, a step function could be seen, corresponding to passengers' ability to pass through a door one at a time, two at a time, or three at a time. This confirmed the hypothesis of Heinz [9,Fig. 31]. The maximum (saturation) flows identified by Fernandez et al. [8] also imply non-linearities, which vary by platform-train gap.

Vestibules and Standbacks
Creating more space in vestibules and, in particular, behind the doors (so that passengers not wishing to leave or board the train can keep out of the way of those who do) would be expected to lead to an increase in passenger movement rates. Interestingly, both Harris et al. [15] and Seriani and Fujiyama [16] only managed to isolate an effect for boarding passengers, and even then the impact was relatively small (only around a 3% rate enhancement per 10-cm increase in the width of the standback).
Understanding that the two variables are often considered jointly by train designers, researchers at UCL [31] undertook one range of experiments attempting to find the optimal sizes of both door widths and standbacks.
Although only analysed within a narrow range, optimum results from their PAMELA mock-up were for widths of 1.7 m and standbacks of 0.3 m. Standbacks and vestibule capacity were both previously analysed [15], but we have now developed a new variable relating to the vestibule capacity actually available, by subtracting from the vestibule capacity the number of passengers recorded as remaining within the vestibule during a station stop. Such a variable unfortunately suffers from the variability of on which side of the train the next platform is, and whether those in the vestibule know this.

Seating Arrangements
Increasing the circulating space of rolling stock would theoretically be expected to increase passenger movements through it, and this can be achieved by amending seating arrangements. For instance, Coxon et al. [32] noted the potential of longitudinal seating (i.e. that along the vehicle body side only), but did not quote any specific results. Harris et al. [15] included in their overall regression terms for car area, and the % of seats which were longitudinal, but subsequent data collection enables us to examine other variables. Information on the number of seats per vehicle, together with its area, now enable us to examine the impact of seating density: one might hypothesise that passenger movements take longer within a train space which includes more obstacles (seats).

Aisle Width
During field observations, there have been times when passenger flow through the train doors has been freeflowing, but boarders are clearly queuing back from an internal obstruction. Detailed investigation showed that there are many types of rolling stock in which the dimensions of the aperture between vestibule and passenger seating ('saloon') area are smaller than those of the train door. This is the case even in many urban rail fleets where there is no actual door into the saloon.
We have therefore hypothesised that the width of the aisle might affect passenger movement rates, both for boarding (as described above) but also potentially for alighting, since it might reasonably be understood to limit the free flow of passengers towards the train doors.
To our knowledge, there have been no previous attempts to estimate the impact of this parameter. What is less clear a priori is what the nature of the relationship might be. We undertook some preliminary analysis to see if its impact on boarding was directly related to vestibule occupancy, on the basis that the aisle width bottleneck would be expected to become more apparent more quickly when the vestibule already had some standing passengers; however, this did not turn out to be the case. This is perhaps because some passengers will seek out a seat, even if there is plenty of space to stand in the vestibule, whilst others will be prepared to stand, even if seats are readily available.
Whilst recognising the difficulty of estimating the aisle width of train fleets with longitudinal seating (we have made a 30-cm allowance for the feet of passengers on each side of the train), separate variables have been developed for the aisle width at the point of entry from the vestibule, and its average elsewhere. These clearly differ in some types of rolling stock (e.g. see Fig. 2). We have taken the narrowest value of each variable, although it is arguable that aisle width is most critical at hip height, since this is where most humans are widest.
We also tried to develop a variable which might measure the relative difficulty of accessing/exiting the main passenger seating areas. This clearly only applies to passengers actually trying to do so, i.e. those in excess of the number remaining in the vestibule. The 'saloon exit difficulty' variable was therefore defined as: Number of alighters=boarders À number in the vestibule Aisle width net of space for feet; where seating is longitudinal ð Þ The variable takes a zero value if the number of alighters/ boarders is less than the capacity of the vestibule, although we recognise the caveat noted above regarding the relative choice of seats and standing space. The variable is maximised if there are many excess passengers, but the rolling stock has narrow aisles. It is recognised that this concept is perhaps more pertinent to boarding, rather than alighting, since alighters perhaps do not crowd in the vestibules to the same extent as boarders.  [33] provide some quantified data for the mainline station at Oxford (Britain), with Fox's paper on the same research [21] highlighting deficiencies in the information about train length as a contributor to the uneven loading resulting from a limited number of platform access points. In order to minimise their walking time along the platform, passengers alighting from trains often do so from carriages nearest their preferred exit, but the purpose of inclusion of this variable here is to see if the rates of boarding and alighting through the chosen door are affected by the distance from the entrance/exit.

Platform Width
Platform width may impact on boarding and alighting rates if there is insufficient space to accommodate both streams of passengers. It may also reflect on-platform conditions which may force some potential boarders to stand further away than they would like from the optimal door position for boarding. Whilst recognising that measuring platform width is slightly arbitrary (we have generally taken the value at the location of the critical door, even if there is considerable variation in width along the length of a platform), our previous research did demonstrate that platform width was a statistically significant determinant of boarding rates.

Double-Sided Platforms
It has long been understood that the provision of separate platforms for boarding and alighting separates different passenger flows, and substantially improves passenger movement rates. Unfortunately, the cost of constructing these prevents their use in only the busiest of locations. However, our data set does include observations at several of these (Sé in Sao Paulo, Stratford in London, and several stations in the core section of the Munich S-Bahn), and our previous research showed this to be a statistically significant determinant of alighting (presumably because an unimpeded platform is available immediately upon disembarkation).

Platform-Train Gaps and Steps
The platform-train interface (PTI) is well known to be a major source of risk to railway passengers [34], so understanding how to maximise passenger flow across it is a matter of safety, as well as passenger flow, importance. Heinz [9] provided useful physiological context about the way in which the human brain is able to walk without mental effort, provided that gaps in the surface are typically less than 15 cm. She noted that the impact of height differences (H) on passenger flows in Sweden was best fitted by a function based on H 4 . Observations made by RCL at British stations with large platform-train gaps [e.g. Bank (Central line); Clapham Junction (platform 15, now substantially improved)] show passengers spending time assessing the gap to be traversed, and what measures they need to take to mitigate any risk and manage their own propulsion: this includes finding grab rails to provide extra 'pull'. Sadly, we have also witnessed people unable to support their own weight plus the extra force generated by the gravity of stepping significantly downwards. Our own programme of work for Crossrail [35] identified a 4% reduction in passenger flow rates from an increase in the Pythagorean platform-train gap of 10 cm, following Weidmann's [23, Table 4.3] summary of similar impacts across a range of German rolling stock types. Daamen et al. [7] demonstrated that larger horizontal and vertical gaps both impacted on passenger flow rates, by around 5% and 10%, respectively.
However, in a matrix of experiments carried out in PAMELA, Fernandez et al. [24] did not always find that passenger movements were maximised with the smallest gaps, and, indeed, it appeared that slight vertical gaps increased movement speeds. This is also in contrast to our previous analysis including data relating to horizontal, vertical and Pythagorean gaps, of which the vertical gap variable was found to be statistically significant-but only for alighting passengers. Two key points need to be made from this. First, one needs to be careful in establishing whether the gaps are estimated from vehicle dimensions or recorded during surveys, as loaded trains rest lower on the suspension, reducing (if not eliminating) the 20-mm theoretical gap commonly planned for. Secondly, there has hitherto been no obvious theoretical explanation for this, although we can now proffer some: a small gap may encourage passengers to take a slightly longer stride, thereby effectively leading to an increase in speed compared to a normally shorter one. Alternatively, perhaps people expect a small gap and mentally slow themselves to question if there really is almost no gap. It is clear that greater understanding is needed of this topic.
Seriani et al.'s latest work [36] concentrated on the impacts of horizontal and vertical gaps on groups of passengers with different levels of mobility, highlighting that even quite small gaps can prevent wheelchair users crossing the platform-train gap at all. This implies that the impact of gaps for more able-bodied people might also not be linear.
Height differences in/out of trains may be adjusted for by the use of steps, either within the vestibule, or outside the main car body (perhaps extending as the doors are opened/retracting as doors close). Holloway et al. [37] used the PAMELA laboratory to undertake some structured experiments about the impacts of luggage and steps in/out of the train. Additionally, demonstrating that the impact of luggage increased with an increasing number of steps, the number of steps also had a proportionately greater impact on older passengers.

Platform Edge/Screen Doors
Doors or screens to protect the PTI are increasingly popular at the busiest urban stations, to improve safety, reduce litter being blown onto the tracks, etc. Our large sample size should enable the regressions to distinguish their impact separate from that of sheer passenger numbers.
Despite Heinz's physiological explanation suggesting that there should be a negative impact of platform edge (= platform screen) doors on passenger movement rates, Rodriguez et al. [38] claimed to find that, if anything, platform edge doors (PEDs) improved passenger flow rates, However, their analysis seemed to compare a station with no PEDs but a 170-mm gap (Green Park, London) with one with PEDs and a level train entry (Westminster), whilst their monitoring also included gaps in passenger flow of up to 10 s (a level of variability which could surely dwarf the results). It is also unclear to us how they controlled their results for changes in passenger quantity, and so how they made their deductions about impacts on passenger movement rates.
Somewhat similarly, results from Thoreau et al. [31] were inconsistent with this variable. They noted that passenger movement rates across a wider 200-mm gap seemed to increase with platform screen doors, but (as theory would suggest) rates with a 75-mm gap and PEDs were similar to those in situations with a wider 200-mm gap and no doors, implying a reduction in rates with platform screen doors.

Inside/Outside
Although weather is an environmental condition, its effects can be mitigated through investment in infrastructure such as platform canopies (although, to avoid contact with overhead electrification systems, these do not always extend sufficiently to cover the PTI itself). One can conceive of situations in which passenger movement rates are slower for outside, as opposed to inside, conditions. For instance, during icy weather, passengers might step in/out of the train more carefully, whilst if wet, passengers might not wish to stand on the platform and ensure maximum social pressure on boarding rates, instead possibly folding up their umbrellas as they board.

Management Actions
Railway companies would hope that the extent of direct supervision of passenger movement would also impact on pedestrian movement rates, but defining management actions to create an independent variable is not easy. Our previous regression [15] attempted to score management action on a 1-5 scale, which included measures such as levels of staffing, platform markings, physical barriers, etc., but this did not turn out to be a statistically significant variable.
Kuipers et al. [6] provide some examples from the literature of savings in overall passenger movement time from management actions, such as controlling the location of queuing and providing information about departure times (to limit late-running passengers) and train crowding conditions (to encourage passengers to spread out along the train). However, whilst all these measures are certainly important in reducing the variable element of passenger movement time, it is less clear that they affect the rate of movement during clustered boarding or alighting, which is the focus of this analysis.
For the avoidance of doubt, impacts of timetable structure, etc. in trying to minimise the total number of passenger movements per train call are excluded from this analysis.

Summary
Whilst not exhaustive, the following table provides a summary of some of the more recent single-variable relationships found by previous authors. Data is shown in two forms: absolute values (e.g. Daamen found an average passenger movement rate of 0.91 passenger/s in [7]), and proportionate values (e.g. Harris et al. [15] found that an increase in the number of alighters/door from 10 to 11 increased alighting rates by 8.8%) ( Table 2).

Analysis
This section sets out how the data has been collected and analysed. In particular, it sets out how scatter plots, singlevariable analyses, data transformations and correlations have been used to understand which (functional forms of which) variables are likely to contribute most to the statistical performance of the model developed.

Data Collection and Programme Selection
Data collection for all the surveys has been undertaken using a standard procedure originally developed by London Underground, which uses two observers focussed on the critical door of the train. The first observer records the numbers of passengers alighting, boarding (clustered passengers only, i.e. excluding any 'late runners'), remaining in the vestibule and any left behind. The second observer records, to the nearest second, the times of wheel stop, door open, last alighter, first boarder, last clustered boarder, door close initiation, passenger door closure, last door closure (elsewhere on the train) and wheel start. Either before or after the capture of passenger data, data can be acquired for platform and train characteristics, either from direct measurement or engineering drawings. It is important to note that, as almost all surveys comprise over 30 observations, this paper is something of a meta-analysis, since it includes over 6000 data points.
Because we landed up with over 30 variables, the issue of measurement error needs to be considered. Some variables (e.g. those relating to rolling stock dimensions) are associated with data at very high levels of accuracy. Other variables become accurate as a reflection of the method of measurement: for instance, many data elements only refer to the critical door on the train, and therefore no averaging between doors is necessary. Some of the remaining variables may be less accurate (e.g. aisle width net of people's feet might be deemed inaccurate, as people's feet are not identical in length), but the use of a small team of surveyors reduces inter-surveyor discrepancies, whilst the iterative nature of our analytical method means that those without significant variation (hence statistical impact) tend to be dropped from the model, with only impactful variables remaining.
In order to apply different multiple regression models to the data available (or subsets of it), the R statistical package has been used. R is a system for statistical computation and graphics, providing a programming language, highlevel graphics, interfaces with other languages and debugging facilities.
R has been used for this analysis for different reasons: first, it has been designed to handle large data sets (larger than Excel, for example) and can still run efficiently while doing so. Secondly, R is also able to create advanced data visualisations for complex data sets. Thirdly, the R code generated is reproducible: indeed, the statistical code developed can be applied to any data sets with only a few changes to code and reference data, which made it particularly efficient for disaggregated analyses of parts of the overall data set.

Data Transformations
The aim of the analysis is to find a relationship among the dependent variable (alighting or boarding rate) and the combination of independent variables having an influence on the dependent variable. In order to achieve this aim, independent variables not having a relationship with the dependent variable need to be excluded from the analysis. A full list of the variables available for analysis was shown above in Table 1; we comment here only on variables for which transformation was required.
(Number of) Train Steps 1 : This variable has been deleted, as the corresponding binary variable appeared to be a better representation of the reality. Indeed, the analysis of the train steps variable is shown in Fig. 3.
There were 165 examples in the data set with zero train steps, and these were, on average, associated with an alighting rate of 1.12 passengers/s. However, the situation of having five steps has only been observed four times and is associated with an average alighting rate of 1.21 passengers/s, apparently implying passengers alight faster when five steps are present than when no steps are present, which is improbable. Even if a situation associated with only four observations is reliable, we cannot judge whether the result is true or not or whether, had we had 165 observations of it, we would not find the same result in terms of alighting rate. Including the variable leads to a counter-intuitive implication that more steps would increase the alighting rate. The insufficiently small sample of circumstances associated with the higher number of steps therefore led to the decision to only consider zero steps and one or more steps, where enough data are available to compare. Consequently, the 'train steps' variable has been removed, and the corresponding binary variable has been used instead. Standback width 2 : The standback width variable seems to be well represented across the sample with no obvious outliers. The standback curve shows that the alighting rate increases up to a standback width of 0.25 m, then it decreases until the standback is equal to around 0.6 m (perhaps as two passengers try to stand in it?), to then increase again, when the standback width is equal to 1 m or wider (Fig. 4).
This could be explained with the phenomenon of passengers being able to alight the train quickly when a small standback is available (e.g. for minor luggage). However, with a slightly wider standback (say between 0.25 and 0.6 m wide), passengers wishing to alight at the next station may remain in it, blocking the passage of some alighters. When the standback is really large, however, observations suggest that there is both sufficient room for passengers to remain in the standback whilst alighters (and boarders) can move unimpeded. Consequently, this variable has been replaced with a step function, showing the following values for the following ranges: 1 for standback y = -0.0503x + 1.0771  width \ 0.25 m, 2 for standback width between 0.25 and 0.60 m, 3 for standback width between 0.60 and 1 m, and 4 for standback widths bigger than 1 m (reflecting multi-user space). DoorWidth 3 : For the same reason explained above for the standback width, the door width has been replaced by a door width_step_function (Fig. 5). Whilst the exact thresholds are arguable, we have taken a function showing the following values for the following ranges: 1 for 'DoorWidth' less than 0.90 m, 2 for DoorWidth between 0.90 and 1.30 m, and 3 for DoorWidth bigger than 1.30 m.
Even after these deletions of potential variables, 34 potential explanatory variables remained available for analysis.

Scatter Plots and Functions
Scatter plots have been used to observe relationships between variables and to help understand the hypotheses derived from previous work regarding some of the variables. Separate scatter plots between the dependent and each of the independent variables are shown in the supporting data document, for alighting and boarding separately. For ease of reference, the best single-variable functions are listed in Appendix A, using more complex relationships where appropriate.
In many cases, logical relationship and/or previous research have determined the type of relationship between variables, both in terms of direction (directly/inversely proportional) and sometimes also in terms of shape (e.g. linear or not). A function, chosen among exponential, linear, logarithmic, polynomial (assumption: maximum second order) and power, has been selected to express the relationship between the dependent variable (corresponding to the alighting/boarding rate) and each of the independent variables, keeping in mind not to add complexity to the system, if not needed.
There are a few cases where there is no evident curve fitting the data provided (confirmed by a very low R 2 for polynomial functions). No function has been associated to them. This reduces the potential number of explanatory variables to 28.

Single-Variable Analysis
A variable-by-variable analysis produced the results shown in Table 3 for all independent variables, regressed separately and linearly.

Correlations
Correlation matrices have been developed to show the relationship between the 28 independent variables selected above, differentiated by alighters and boarders. The matrices identify the independent variables strictly correlated to each other, which may need to be excluded from the next steps of the analysis to avoid multicollinearity issues.
Results obtained (in absolute value, as the sign only indicates if the relationship is directly or inversely proportional) are shown in Tables 4 and 5. The variables that are strictly correlated to each other are in red, the variables that are less correlated to each other are in yellow and the variables that are almost not correlated to each other are in green.
Unlike some statistical analyses, however, it is appropriate to include within the data set for analysis variables which are closely correlated but nevertheless are independent. For instance, door width and aisle width are rolling stock variables which can be designed to be independent, even if metro trains normally have larger values of both than long-distance rolling stock. Similarly, inter-door distance is correlated with the 'outside protected dummy', because these are conditions which generally prevail more on mainline railways, but they can nevertheless be addressed separately.
However, among the 28 variables taken into consideration, some are clearly alternative formulations of the same measure (e.g. 3a, 3b, 3c), where it would be inappropriate to include all formulations. Among the alternative formulations of the same measure, the formulations strongly correlated (correlation [0.5) to fewer other variables have been selected for one of the following regression models, leaving 18 separate and uncorrelated variables. However, other models including all variables have been developed as well, and results have been compared. 3 Note that the graph and table above represent the relationship between door width and alighting rate, but a similar situation refers to the relationship between door width and boarding rate.

Stepwise Regressions
The method used for the selection of the independent variables to include in the analysis was stepwise regression, which is a systematic search algorithm to find the best model. The stepwise regression performs the searching process automatically and selects the best combination of independent variables that describe the dependent variable. The selection of the best variables follows the Akaike information criterion (AIC), and the direction of the regression is both forward and backward selection. It means that, starting with no predictors, the most contributive predictors are sequentially added (like forward selection). After adding each new variable, any variables that no longer provide an improvement in the model fit are removed (like backward selection). This iterative approach considered the statistical significance of all variables at each stage after recalculation; only variables with p-values \ 0.1573 (the default in the software used) were then omitted, whether these had been significant in previous iterations or not. The summary of the stepwise regression shows the selected best variables and related parameters. An alternative approach would be to consider the variables which are most easily addressed practically, an issue to which we return in Table 9. Different stepwise regression models have been developed, all of them including a non-null intercept. Indeed, since very low numbers of passengers have already been taken into account (by excluding all values less than 1 from the original database), the models will include an intercept. Note that the models below, including an intercept, are only relevant when there is at least one passenger movement per door. The different models developed are as follows: • Multiple regression 1: multiple linear stepwise regression between the dependent variable and the 34 original independent variables, with a non-null intercept. Each of the variables introduced in the model is assumed to be linearly related to the dependent variable. Results of the statistically significant variables are shown in Appendix A. • Multiple regression 2: multiple linear stepwise regression between the dependent variable and the 18 selected independent variables (selection based on the correlation matrix), with a non-null intercept. Each of the variables introduced in the model is assumed to be linearly related to the dependent variable. • Multiple regression 3: multiple polynomial stepwise regression between the dependent variable and the 18 selected independent variables (selection based on the correlation matrix), with a non-null intercept. Each of the variables introduced in the model is assumed to be represented by a function, corresponding to the function expressing the relationship between the dependent variable and each independent variable (see 'Scatter Plots and Functions'). • Multiple regression 4: multiple polynomial stepwise regression between the dependent variable and the 28 selected independent variables (selection based on the scatter plots), with a non-null intercept. Each of the variables introduced in the model is assumed to be represented by a function, corresponding to the function expressing the relationship between the dependent variable and each independent variable (see 'Scatter Plots and Functions').

Comparison of Models and Choice of the Best Model
Three primary criteria were used to determine the model variant deemed to be the best: (i) Significance of the entire regression and each independent variable • The F-statistic is a test of significance for the entire regression. If the p-value of the Fstatistic is \ a, the model is a-significant. This means that at least one of the independent variables is significantly related to the outcome (or dependent) variable, being different from 0. • In terms of contribution of each of the independent variables in the prediction of the dependent variable, it is important to focus on the t-statistic and p-values of each independent variable.
(ii) Model accuracy assessment The overall quality of the model can be assessed by examining the multiple R-squared (R 2 ), adjusted R-squared and residual standard error (RSE).
• In multiple linear regression, the multiple Rsquared represents the correlation coefficient between the observed values of the outcome variable (y) and the fitted (i.e. predicted) values of y. It is a measure of the 'explained variation' and provides the percentage of the total variation in alighting rate that is explained by the regression. • The adjusted R 2 is a correction for the number of independent variables included in the prediction model. It is a measure of the percentage of the total variation in alighting rate that is explained by the regression, considering the number of independent variables included in the model. • The RSE estimate gives a measure of error of prediction, or 'unexplained variation'. The lower the RSE, the more accurate the model.

(iii) Variable coefficients
The equation representing each model is: For a given independent variable, the coefficient can be interpreted as the average effect on the dependent variable of a one unit increase in the independent variable, holding all other independent variables fixed. The higher the absolute value of the coefficient, the stronger the influence of the independent variable on the dependent variable.
Also, a positive b shows a directly proportional relationship between the dependent and independent variables, while a negative b shows an inversely proportional relationship between the dependent and independent variables.
Alighters: Table 6 shows the statistical results obtained for each model, indicating the following: • The p-value of the F-statistic is \ 2.2e-16 for all models, which is highly significant.     • The R 2 values should not be compared among the models above, as they involve a different number of variables. To take into consideration the variables included in the linear models considered, the adjusted R 2 should be compared. • The adjusted R 2 is generally very high, showing that, in all models, the total 'explained variation' in alighting rate is mostly explained by the regression, considering the number of independent variables included in the model. However, regression model 1 is characterised by the highest explained variation, meaning the 11 variables selected by the model are the ones that mostly explain the regression. Model 3 is characterised by the lowest explained variation.   • As expected, the following functions are directly proportional to the alighting rate: number of alighters, platform width and doors per car. • As expected, the following functions are inversely proportional to the alighting rate: vestibule load vs capacity, percentage with luggage, outside unprotected dummy and inter-door distance. • The following function is directly proportional, but it was expected to be inversely proportional to the alighting rate: horizontal step distance. The results of a further investigation are shown below.
The horizontal step distance is represented as follows (Fig. 6): The analysis of this variable is clearly affected by an outlier. The negative horizontal step distance (-20 mm) represents a location where the platform and train genuinely overlap. Although we could have deleted this outlier, in this case, the step distance was reset to zero, as the effective horizontal gap is zero (a vertical gap obviously remains). The corresponding b obtained from the model is not very reliable (a = 1), and the associated equation may not correctly represent the reality.
The values shown in the table above also suggest that a gap of 100 mm allows an alighting rate (equal to 1.17 passengers/s, on average) to be higher than the alighting rate associated with a distance of 50 mm (alighting rate equal to 1.11 passengers/s, on average). Whilst the alighting rate might theoretically be expected to decrease when the horizontal step distance increases, other researchers have also found a maximum passenger flow to occur with this size of gap.
At this stage, it was clear that we had a model with three very statistically significant variables (characterised by a = 0.001): NumberOfAlighters, VestibuleLoadVsCapacity and DoorsPerCar. However, a choice needed to be made between the different model formulations.
Boarders: Table 7 shows the results obtained for each model, indicating the following: • The p-value of the F-statistic is \ 2.2e-16 for all models, which is highly significant. • As before, a comparison on the adjusted R 2 values is appropriate, and these values are generally very high. This shows that, in all models, the total 'explained variation' in boarding rate is mostly explained by the regression, considering the number of independent variables included in the model. However, regression model 1 is characterised by the highest explained variation, meaning the 13 variables selected by the model are the ones that mostly explain the regression. Model 3 is characterised by the lowest explained variation. InterDoorDistance_m ? 0.0562685 VestibuleSi-ze_m2 -0.0055227 CarAreaVsDoorWidth_m -0.0348611 DoorsPerCar It shows the following: • As expected, the following functions are directly proportional to the alighting rate: number of boarders, platform width, vestibule size. • As expected, the following functions are inversely proportional to the alighting rate: vestibule load per passenger, percentage of passengers with luggage, train steps_binary, inter-door distance, car area vs door width. • The A_X_B function was found to be directly proportional with passenger flow, but was expected to be inversely proportional, reflecting increased boarding difficulty when there are many alighters. However, an alternative hypothesis is that a greater number of alighters creates more room for boarders to enter, so this latter hypothesis may indeed be more appropriate. • The following function is inversely proportional, but it was expected to be directly proportional to the boarding rate: doors per car. This variable requires further investigation (Fig. 7).
Further investigation should include the outlier in this data set. Ten doors per car are associated with a boarding rate of 0.46 passengers/s, which is the lowest value present in the table above. It might therefore consequently be considered an outlier and removed. In general, the boarding rate would be expected to increase when the number of doors per car increases. However, it does relate to a genuine type of British train (now scrapped): VEP (Vestibuled Electro-Pneumatic brake) train sets on train services south of London were one of several types in which slam doors were fitted to every seating bay (Fig. 8). This minimised  the number of passenger movements per door, which is perhaps the most critical variable, but management of which effectively precedes all the analysis shown in this paper. However, the lack of space between passengers sitting opposite each other meant that there was effectively no aisle space, and boarders/alighters had to be careful (hence the rate of passenger movement was much slower) in ensuring that they were not standing on the feet of other passengers. It is possible that the curve might become stable (but not decrease) when the number of doors is bigger than a certain threshold (which, in this case, corresponds to 3.5 doors). However, whilst included within the stepwise regression, the corresponding b obtained from the model is not very reliable (a = 0.1), so the associated equation may not correctly represent reality.
That model is based on the following variables as the most statistically significant (and characterised by a = 0.001): number of boarders, VestibuleLoad_pass, PlatformWidth_m, VestibuleSize_m2. The best models provided have been used to proceed with the disaggregation by type of railway (differentiating between alighters and boarders), conscious that some variables are more statistically significant than others (as they are characterised by a = 0.001).

Alighting
Model 2 is therefore regarded as the best model, giving For a typical situation with 10 people boarding through the vestibule 10% full, 5% of passengers having luggage, an 'outside' platform 3 m wide, a train-platform gap of 100 mm and a train with 2 doors/car, 10 m apart, the forecast alighting rate would be = 0.63 ? 0.25 -0.07 -0.06 -0.11 ? 0.12 ? 0.06 ? 0.23 -0.19 = 0.86 passengers/s. All the variables are as expected, with the exception of the horizontal stepping distance where (as discussed in Sect. 3.3 above) previous research findings have varied widely, and there is (as yet) no consensus as to the underlying theoretical basis for any hypothesis.

Boarding
Model 2 is again regarded as the best model, giving Again, all the results are as expected, bar one. Here, it is the 'doors per car' variable which is not directly intuitive: one would expect more doors per car to reduce access times to, and queuing in, doorways, and hence increasing passenger movement rates. As the model also includes the variable 'car area per door width' (if only significant at 95%), we suspect that the 'doors per car' variable is actually acting as a proxy for more crowded urban environments, since it is in such places that train operators employ rolling stock with more doors. This hypothesis is supported by the disaggregation of results by type of railway (see Sect. 6), where this variable only occurs with this sign in the highest-frequency metro environment, and is of the opposite sign in the medium-frequency category.
Several of the variables produce results which are directly comparable with movement rates quoted by other authors. In terms of luggage, the limitation of 'luggage' to large items, and the train types involved may explain some of the differences (in particular, steps are of greater impediment to passengers with luggage), but these recent results are in the upper half of other research. If one assumes a maximum vestibule capacity of 25 passengers and a typical passenger movement rate of 1 passenger/s, then our results for boarding closely match those of Palmqvist et al. [25], but are higher for alighting (since each extra person in the vestibule takes up about 4% [100%/25] of its space). Other variables previously found to be statistically significant (e.g. standback width) have not provided such results here (Table 8).
In order to put impacts into perspective, Table 9 below sets out the measures which might be taken by a train operator or infrastructure company to increase passenger movement rates by 10%. This gets away from the trying to compare the parameter values of the model directly (since these are determined by the units of measurement) and helps to understand the real impacts of the results, for practical application: some variables are simply more easily amended by 10% than others. In these calculations, we have assumed typical movements of 10 alighters or boarders, and vestibules with a maximum capacity of 25 passengers.
Two of the values are slightly counter-intuitive. The regression parameter for horizontal stepping distance is positive, but only slightly, and the parameter is small; there is some evidence (e.g. see Fig. 6) that, whilst small horizontal gaps are indeed of negligible impact on passenger flows, larger ones have more severe negative impacts. Whilst one might have thought that increasing gaps would lead to decreasing flows, these results could possibly reflect the (delay of the) mental decision required with very small gaps, when for some people it could be possible to step on the platform and the train simultaneously.
The negative parameter for the 'doors per car' variable for boarding is thought to be reflecting the more crowded conditions in which train operators choose vehicle designs with more doors. It is not recommended that railway companies reduce the number of doors per car, although Heinz [9] did demonstrate that it is the total amount of door width/car (area), rather than the width of each individual door, which is important.

Comparison with Previous Models
The output from the 2014 model has been widely used commercially, for instance in the scoring of bids for rolling stock tenders for new trains. It therefore seemed appropriate to see whether the latest analysis was producing similar results and hence similar appraisals of specific situations. The parameter values are compared in Table 10. Note that some variables have been consistently included with similar parameter values, whilst others have entered the model or been dropped from it as the database has been expanded.
Re-estimation with a larger data set seems to have had several effects, including the following: (i) confirmation of parameters and their values (e.g. platform width for boarding, was 0.056, now 0.061); (ii) confirmation of concepts but measured slightly differently (e.g. total vestibule size now being preferred to explain boarding rates, rather than just the standback); (iii) new variables genuinely adding value (e.g. the % of passengers with heavy luggage).
The level of explanation in the model, as measured by R 2 , is broadly the same.
Analysis of a range of rolling stock designs proposed in a European country showed that the results for alighting rates from this model were extremely similar to the previous one, but those for boarding were 10%? higher, largely due to a better understanding of the impact of large vestibules.

Discarded Variables
In Sect. 4.5, we set out our approach for finding the best model, an approach which started by including all 34 variables for which data was available. However, a few variables did not provide any statistical explanation at all, and were therefore excluded from further analysis. Some of the variables were different functions relating to the same underlying concept, so it was an advantage in being able to select one variant. Nevertheless, two variables were excluded entirely, and are worthy of particular comment. The use of a double-sided platform has long been regarded by metro engineers and operators as a positive contributor to minimising station stop times, on the basis of minimising the interactions between alighting and boarding passengers. We have not managed to demonstrate that, possibly through the small number of observations with it (11). An alternative explanation for this might be the difficulty of including within the model two different door opening times: it is common practice at such locations to open the door for alighting first, and only to open the door for boarders when alighting is largely complete.
The use of platform edge/screen doors has been posited in previous research both to reduce and to increase passenger flow rates. Single-variable analysis suggests a positive coefficient of 0.3 for both alighting and boarding, but multi-variable analysis here shows that we cannot identify any impact to be statistically significantly different from zero when other variables are included.
Several variables have not 'made the cut' into the final smaller models, in order to satisfy statistical efficiency tests. Nevertheless, it is worth noting their continuing

Conclusions and Recommendations
This paper reports the statistical meta-analysis of passenger alighting and boarding rates from a large international data set of over 200 surveys reflecting over 6000 measurements of individual trains. Data was available for 34 different variables, reflecting characteristics of passenger flow, rolling stock design, infrastructure and management. Single-variable regressions produce statistically significant coefficients for most of these, giving guidance to train specifiers, designers and operators. A methodology has been developed which has been tested, works well with the data provided and can be repeated with a different data set. A stepwise regression method using the R statistical package was then used to develop a more tractable model with fewer variables. This process eliminated those variables shown to provide no statistical explanation (including the presence of platform edge doors). Of the remaining 18 hypothesised variables, all provided some form of statistical explanation at the 90% level (or more) in one model or another.
The method has enabled the analysis of an international data set to determine the factors most influential in affecting the rates at which passengers alight from and board trains. The resulting relationships are as follows:  Tables summarising the variables tested and their significance in different models are shown below. Where we have never achieved statistical significance for a variable across multiple model types, we nevertheless thought it would be helpful (for completeness) to provide some form of estimate, shown with a note in italics. These are based on: (a) estimates from the 2014 model; (b) estimates from single-variable analysis; (c) non-statistically significant parameter values from the main analysis here (Tables 11 and 12).  However, in summary, the preferred models demonstrated passenger alighting rates to be primarily a function of: • the number of alighters; • vestibule load/capacity; • % of passengers with luggage; • whether or not the platform is unprotected from the weather; • platform width; • the horizontal stepping distance between the platform and the train; • the number of doors per carriage; and • the distance between those doors.
Those variables shown in bold have the greatest statistical significance. The positive parameter for horizontal stepping distance appears to be counter-intuitive, but is small and also reflects values found by other researchers.
Passenger boarding rates are primarily a function of: • the number of boarders; • the product of the number of alighters and the number of boarders; • the vestibule load (passengers who neither alight nor board); • % of passengers with luggage; • platform width; • vestibule size; • whether there are any steps in the train doorway; • the number of doors per carriage; • the distance between those doors; and • the ratio of car area to door width.
Again, those shown in bold have the greatest statistical significance.
In terms of future work, the database used should be increased in size to obtain more reliable results, especially for some of the more disaggregated analyses (e.g. by type of railway), where fewer data points are available in each category. Specific issues are worthy of further attentionfor instance, the use of platform screen doors and doublesided platforms, both of which features have historically been understood to be helpful to passenger movement rates, but which have not been proven here. Nevertheless, many of the variables shown as statistically significant for the entire data set also feature in disaggregated models, and with similar parameter values. Others would not necessarily be expected to provide the same level of explanatory power in subsets of the data, if they are inherently less important (e.g. heavy luggage is proportionately more important for longer-distance rail journeys).

Declarations
Conflict of interest The authors are not aware of any competing interests relevant to this research.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.