Cost and time damping: evidence from aggregate rail direct demand models
Abstract
There is a significant body of evidence from both disaggregate choice modelling literature and practical travel demand forecasting that the responsiveness to cost and possibly to time diminishes with journey length. This has, in Britain at least, been termed ‘Cost Damping’, and is recognised in guidance issued by the UK Department for Transport. However, the consistency of the effect across modes and data types has not been established. Cost damping, if it exists, affects both the forecasting of demand and our understanding of behaviour. This paper aims to investigate the evidence for cost and time damping in rail demand using aggregate rail ticket sales data. The rail ticket sales data in Britain has, for many years, formed the basis of analysis of a wide range of impacts on rail demand. It records the number of tickets sold between station pairs, and it is generally felt to provide a reasonably accurate reflection of travel demand. However, the consistency of these direct demand models with choice modelling and highway demand model structures has not been investigated. Rail direct demand models estimated on ticket sales data indicate only slight variation in the fare elasticity with distance, as is evidenced in the largest meta-analysis of price elasticities conducted to date (Wardman in J Transp Econ Policy 48(3):367–384, 2014). This study of UK elasticities shows strong variation between urban and interurban trips, presumably a segmentation at least in part by purpose, but less remaining variation by trip length. A lack of variation by length supports the hypothesis of cost damping: constant cost sensitivity would imply that fare elasticity increases strongly with distance, because fares are higher at longer distances. In this paper we indicate that rail direct demand models have some consistency of behavioural paradigm with utility-based choice models used in highway planning. 
We go on to use rail demand data to estimate time and fare elasticities in the context of various cost damping functions. Our empirical contribution is to estimate time elasticities on a basis directly comparable with cost elasticities and to show that the phenomenon of cost damping is strongly present in ticket sales data. This finding implies that cost damping should be included in models intended for multimodal analysis, which may otherwise give incorrect predictions.
Keywords
Demand analysis · Rail demand · Model estimation · Cost damping
Introduction
There is a significant body of evidence from both disaggregate choice modelling literature and practical transport modelling that the sensitivity to cost and possibly to time diminishes with journey duration; Daly (2010) and Rich and Mabit (2013) give reviews, indicating that the effect may be caused by heteroskedasticity in the travelling population. This has, in Britain at least, been termed ‘Cost Damping’, and has entered demand analysis and appraisal guidance issued by the Department for Transport (DfT 2014).
If such relationships exist in disaggregate data and in car trip matrices, we would expect to find them in other forms of behavioural response. The paradigm of utility maximisation, for example, applies in principle to all choices and travel behaviour. One source of evidence on behavioural response is rail ticket sales data. In Great Britain there exists a large and accurate record of the number of rail trips made between many pairs of stations and this has been extensively exploited to support a wide range of modelling opportunities. As a result there is extensive evidence on how rail demand is influenced by fares, journey time and economic activity, and to a lesser extent the impact of competition from other modes and other service quality factors (ATOC 2013).
In this paper we examine whether there is evidence to confirm such cost damping effects in rail ticket sales data covering a very wide range of flows and many years. The existing results indicate that there is limited variation with distance in the elasticity of demand with respect to fare (‘fare elasticity’). Simply, we would expect that fare elasticity should become stronger (more strongly negative) with distance, because longer trips have generally higher fares, so that (in a linear model) a proportional increase on all fares has more impact at longer distances. So a finding of limited variation with distance gives evidence in support of the hypothesis of cost damping. That is, the response mechanism in the model is not linear but shows a decreasing proportional sensitivity to fares as fares increase. In this paper we present further analyses of this issue and compare this with the disaggregate evidence.
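The expectation stated above can be made explicit with a short derivation. As a sketch, assume the linear (undamped) form \(\log T = B + \lambda_{1} c + \lambda_{2} t\), with c the fare, t the journey time and \(\lambda_{1} < 0\). The fare elasticity is then

$$\frac{\partial \log T}{\partial \log c} = c\,\frac{\partial \log T}{\partial c} = \lambda_{1} c$$

so that, under this assumption, a flow with twice the fare has twice the (negative) fare elasticity. Elasticities that vary little with distance are therefore hard to reconcile with a constant \(\lambda_{1}\).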
The importance of these findings would be to show the consistency of behaviour across different modes and data types, so that consistent utility functions can be used. Practically, the findings indicate how fare elasticity varies with trip length, suggesting ways in which fares could be set to optimise revenue. For modelling, the results indicate that cost damping should be included in models of travel demand.
We also take the opportunity to conduct an analogous analysis of time elasticities. Taken together, the two analyses provide evidence on how the value of time (VOT) varies with journey distance. VOT is an important parameter in transport planning and one about which there is also much evidence from the disaggregate literature, but little evidence derived from aggregate data.
The analysis here is based upon tens of thousands of observations. The basic approach is to estimate fixed-effect pooled cross-section time-series models but instead of the standard approach of estimating constant elasticities with respect to fare and time we allow the fare and time elasticities to vary with journey length in ways suggested by disaggregate and conventional transport planning analyses. This is done by specifying continuous functions which allow an increasing, diminishing or no effect from journey length, measured by distance, cost or journey time, on the elasticities, as empirically justified.
The following section of the paper discusses the formulation of the models to be used for analysis and how a near-equivalence can be obtained between the models of conventional transport demand analysis and those previously used for rail ticket sales data. Section 3 describes the data that is used and presents the results of analyses of this data using the various model forms proposed. Section 4 presents the summary results and gives a discussion and conclusions.
Formulation of hypotheses
While the data on which the main analysis of this paper is based comprises substantial numbers of observations, each of these observations contains relatively little detail about the travellers. Thus, analysis of this data yields statistically very solid results but limited behavioural insight. The question the paper then addresses is whether insights from other areas of travel analysis, in particular discrete choice, can be transferred to the estimation context of large-scale ticket sales data. We focus particularly on Cost Damping.
Cost Damping
Cost Damping is the feature in some travel demand models whereby the sensitivity of the model to marginal changes in time and/or cost declines as journey lengths increase. It has been observed in a wide range of choice modelling contexts (for a review and discussion of potential causes, see Daly and Carrasco 2009) and is incorporated in many models used for practical transport planning in the United Kingdom. Recent changes to official Guidance (DfT 2014) have accepted that it may be necessary to include such variation in practical models of travel demand.
The DfT Guidance draws from a detailed study of Cost Damping in theory and practice. The report of that study (Daly 2010) found that effectively all of the practical urban, regional and national travel demand models that were in use in the UK were based on generalised cost functions; many used some form of Cost Damping. Daly (2010) suggested that cost or time budgets were not relevant to the phenomenon, but that heteroskedasticity in preferences is a likely cause. Four distinct types of Cost Damping were found to be used in practice to adapt the generalised cost functions, as illustrated in the following table adapted from the report.
The first classification, represented by the columns of the table, indicates whether the mechanism operates on the entire generalised cost function, or separately on the components of the function, e.g. the time and cost. The second classification distinguishes between transformations in which the variation over trip length is fixed, most often by relating it to the travel distance, and those in which it is a function of the travel variables only, most often a nonlinear function of time and cost. The report advises against the use of distance in such functions, as there is no behavioural basis for such use, but the practice is very convenient and therefore widespread.
With the GEV assumption for the distribution of the error term, this framework implies the prediction of demand by a logit-type model, based on the representative utilities V.
Particularly for public transport applications, the term t above should be understood as ‘generalised journey time’ comprised of weighted components such as access time, waiting time and ‘line-haul’ time; to emphasise this point we use the notation GJT for this variable. A correction may be applied to account for crowding. Other variables may also be included in the generalised cost function to account for interchanges and other service aspects.
The term B may account for destination-specific as well as mode-specific aspects of choice. A normalisation will usually be required because it is not possible to identify a full set of such constants.
The term λ is implied to be constant across modes in the formulation above and indeed this constraint is often applied in practice. However, this practice may be criticised as it does not take account of different levels of comfort etc. that apply to the different modes. In the present paper, we are concerned purely with train journeys and the issue is not of concern.
Table 1 Classification of cost-damping mechanisms

| Operating on | Entire function | Separate components |
| --- | --- | --- |
| Fixed function (‘differential scaling’) | I | II |
| Dependent on costs (‘transformed variables’) | III | IV |
Interpretation for ticket sales data analyses
In the formulation presented, which is most common in European models (see e.g. Fox et al. 2003), destination choice is modelled as more sensitive to time and cost than mode choice.

the probabilities \(p_{jm}\) and \(p_{mj}\) are so small they can be neglected (or \(p_{mj}\) is small and θ is close to 1) and

we neglect also any difference between the distributions of the error term.
Rail demand from one origin is distributed over many destinations, while rail typically has a small share of the total market from any origin and destination, so it does not seem unreasonable to expect the relevant probabilities to be small. However, for some origins (e.g. in the commuter belt of large cities) these assumptions may be less accurate. The difference between the assumptions concerning the error term is of less concern.
Therefore, it seems reasonable, at least as an approximation, to draw a close parallel between the ‘utility’ function \(V_{mj}\) used in conventional planning and the function \(V_{kt}\) used in rail data analysis. In particular we may test the cost damping functions derived for disaggregate modelling and conventional transport planning in the context of ticket sales data.
Model formulation for analysis
Table 2 Model forms proposed

| Type | Model form | Constraints |
| --- | --- | --- |
| 0 | \(\log T = B + \lambda_{1} c + \lambda_{2} t\) | \(\lambda < 0\) |
| I | \(\log T = B + (\lambda_{1} c + \lambda_{2} t)\,d^{-\alpha}\) | \(\lambda < 0\), \(0 < \alpha < 1\) |
| II | \(\log T = B + \lambda_{1} c\,d^{-\alpha_{1}} + \lambda_{2} t\,d^{-\alpha_{2}}\) | \(\lambda < 0\), \(0 < \alpha < 1\) |
| III | \(\log T = B + \lambda_{0} (c + \lambda_{2} t)^{\alpha}\) | \(\lambda_{0} < 0\), \(\lambda_{2} > 0\), \(0 < \alpha < 1\) |
| IIIA | \(\log T = B + \lambda_{0} \log (c + \lambda_{2} t)\) | \(\lambda_{0} < 0\), \(\lambda_{2} > 0\) |
| IV | \(\log T = B + \lambda_{1} c^{\alpha_{1}} + \lambda_{2} t^{\alpha_{2}}\) | \(\lambda < 0\), \(0 < \alpha < 1\) |
| IVA | \(\log T = B + \lambda_{1} \log c + \lambda_{2} \log t\) | \(\lambda < 0\) |
| IVB | \(\log T = B + \lambda_{1} \alpha_{1} c + \lambda_{1} q_{1} (1 - \alpha_{1}) \log c + \lambda_{2} \alpha_{2} t + \lambda_{2} q_{2} (1 - \alpha_{2}) \log t\) | \(\lambda < 0\), \(0 < \alpha < 1\), q constant \(> 0\) |
For models of types I to IV, the functional form for the cost damping function needs to be specified. There is a wide range of possibilities and among these we have chosen to work primarily with power functions. For Models of Type III and IV we additionally tested log functions, as set out in Table 2.
In Table 2, d indicates the distance between origin and destination, while λ and α are parameters to be estimated. B denotes effects specific to the origin and destination. t represents the generalised journey time GJT. Certain limits can be imposed on these parameters, for example that we would not easily accept models in which the sensitivity to time or cost increased with increasing trip length, as this is contrary to a large body of experience.
A further restriction on model form is the ‘kilometrage test’ presented by Daly (2010). This test is based on the commonsense notion that if the price of travelling per kilometre is increased one would not expect more kilometres to be travelled. It is not a strict economic argument, but general economic thinking suggests that it is a sensible requirement to impose on a model. In the context of rail data analysis the test should be slightly less rigorous, because the connection between price and distance is not at all precise. Nevertheless, it seems reasonable to impose on our models the requirement that an increase in price per distance will not cause the distance travelled to increase. Similarly, a reduced speed should not cause the distance travelled to increase and this test on time can be extended, admittedly with less precision, to the generalised time GJT. These tests impose limits on the functions we can use for cost damping.
It must always be the case that demand declines with increases in time and cost and this imposes the sign constraints on λ as shown in Table 2. The table also indicates the range of values of the parameters α that will give cost damping and maintain consistency with the kilometrage test. Normally we would not accept values outside these ranges. Additionally, for models IV and IVB, we should require that α _{1} ≤ α _{2}, so that VOT does not decrease with trip length. Model IVB interpolates between linear and log form.
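The model forms of Table 2 can be written down compactly in code. The following is a minimal sketch, not the authors' implementation: the function name `damped_utility`, the parameter dictionary keys and all numerical values are hypothetical and chosen only for illustration. It returns the fare and time part of log T (everything except the constant B) and checks the statement that Model IVB interpolates between the linear and log forms.

```python
import math

def damped_utility(model, c, t, d, p):
    """Fare/time part of log T (excluding the constant B) for the Table 2 forms.

    c: fare, t: generalised journey time, d: distance, p: dict of parameters.
    """
    if model == "0":
        return p["lam1"] * c + p["lam2"] * t
    if model == "I":
        return (p["lam1"] * c + p["lam2"] * t) * d ** (-p["alpha"])
    if model == "II":
        return (p["lam1"] * c * d ** (-p["alpha1"])
                + p["lam2"] * t * d ** (-p["alpha2"]))
    if model == "III":
        return p["lam0"] * (c + p["lam2"] * t) ** p["alpha"]
    if model == "IIIA":
        return p["lam0"] * math.log(c + p["lam2"] * t)
    if model == "IV":
        return p["lam1"] * c ** p["alpha1"] + p["lam2"] * t ** p["alpha2"]
    if model == "IVA":
        return p["lam1"] * math.log(c) + p["lam2"] * math.log(t)
    if model == "IVB":
        cost = p["alpha1"] * c + p["q1"] * (1 - p["alpha1"]) * math.log(c)
        time = p["alpha2"] * t + p["q2"] * (1 - p["alpha2"]) * math.log(t)
        return p["lam1"] * cost + p["lam2"] * time
    raise ValueError(f"unknown model type: {model}")

# Hypothetical parameter values and trip attributes, for illustration only.
base = {"lam1": -0.02, "lam2": -0.005, "q1": 1.0, "q2": 1.0}
c, t, d = 25.0, 120.0, 80.0

# alpha = 1 reduces IVB to the linear Model 0; alpha = 0 (with q = 1) to Model IVA.
linear_end = damped_utility("IVB", c, t, d, {**base, "alpha1": 1.0, "alpha2": 1.0})
log_end = damped_utility("IVB", c, t, d, {**base, "alpha1": 0.0, "alpha2": 0.0})
```

With q different from 1, the α = 0 end of Model IVB is the log form with its fare and time terms scaled by q, rather than Model IVA itself.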
For these models, elasticities with respect to time and cost can be derived, as follows.
Table 3 Elasticities of proposed model forms

| Type | Cost elasticity | Time elasticity | Value of Time ν |
| --- | --- | --- | --- |
| 0 | \(\lambda_{1} c\) | \(\lambda_{2} t\) | \(\lambda_{2} / \lambda_{1}\) |
| I | \(\lambda_{1} c / d^{\alpha}\) | \(\lambda_{2} t / d^{\alpha}\) | \(\lambda_{2} / \lambda_{1}\) |
| II | \(\lambda_{1} c / d^{\alpha_{1}}\) | \(\lambda_{2} t / d^{\alpha_{2}}\) | \((\lambda_{2} / \lambda_{1})\, d^{\alpha_{1} - \alpha_{2}}\) |
| III | \(\alpha \lambda_{0} c\, G^{\alpha - 1}\) | \(\alpha \lambda_{0} \lambda_{2} t\, G^{\alpha - 1}\) | \(\lambda_{2}\) |
| IIIA | \(\lambda_{0} c / G\) | \(\lambda_{0} \lambda_{2} t / G\) | \(\lambda_{2}\) |
| IV | \(\lambda_{1} \alpha_{1} c^{\alpha_{1}}\) | \(\lambda_{2} \alpha_{2} t^{\alpha_{2}}\) | \(\frac{\lambda_{2} \alpha_{2} t^{\alpha_{2} - 1}}{\lambda_{1} \alpha_{1} c^{\alpha_{1} - 1}}\) |
| IVA | \(\lambda_{1}\) | \(\lambda_{2}\) | \(\frac{\lambda_{2} c}{\lambda_{1} t}\) |
| IVB | \(\lambda_{1} \left( \alpha_{1} c + q_{1} (1 - \alpha_{1}) \right)\) | \(\lambda_{2} \left( \alpha_{2} t + q_{2} (1 - \alpha_{2}) \right)\) | \(\frac{\lambda_{2} c \left( \alpha_{2} t + q_{2} (1 - \alpha_{2}) \right)}{\lambda_{1} t \left( \alpha_{1} c + q_{1} (1 - \alpha_{1}) \right)}\) |

Here \(G = c + \lambda_{2} t\), the generalised cost appearing in Models III and IIIA.
In Model IVA the elasticity is constant, so that this model corresponds to the classical rail demand analysis.
This function is continuous at α = 0, facilitating estimation in some cases. When appropriate, we have made this substitution.
Results of analyses based on these models are reported in Sect. 3.
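The elasticity expressions can be checked numerically against the model forms. The sketch below does this for Model III only, comparing the analytic fare and time elasticities with central finite differences of the model function. All names and parameter values are hypothetical and are not estimates from this paper.

```python
import math

def log_demand_III(c, t, lam0, lam2, alpha):
    """Fare/time part of log T for Model III: lam0 * (c + lam2 * t) ** alpha."""
    return lam0 * (c + lam2 * t) ** alpha

def elasticities_III(c, t, lam0, lam2, alpha):
    """Analytic fare and time elasticities for Model III, with G = c + lam2 * t."""
    G = c + lam2 * t
    fare_el = alpha * lam0 * c * G ** (alpha - 1)
    time_el = alpha * lam0 * lam2 * t * G ** (alpha - 1)
    return fare_el, time_el

def numeric_elasticity(f, x, h=1e-6):
    """Central-difference estimate of d f / d log x, evaluated at x."""
    return (f(x * (1 + h)) - f(x * (1 - h))) / (math.log(1 + h) - math.log(1 - h))

# Hypothetical parameters and trip attributes, for illustration only.
lam0, lam2, alpha = -2.0, 0.25, 0.5
c, t = 20.0, 100.0

fare_el, time_el = elasticities_III(c, t, lam0, lam2, alpha)
fare_num = numeric_elasticity(lambda x: log_demand_III(x, t, lam0, lam2, alpha), c)
time_num = numeric_elasticity(lambda x: log_demand_III(c, x, lam0, lam2, alpha), t)

# Damping: doubling both fare and time multiplies the fare elasticity by
# 2 ** alpha, which is less than 2 whenever alpha < 1.
ratio = elasticities_III(2 * c, 2 * t, lam0, lam2, alpha)[0] / fare_el
```

In the undamped Model 0 the same doubling would double the elasticity, so the gap between `ratio` and 2 is one way to read the strength of the damping.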
The role of income
The sensitivity of demand to the cost of rail tickets depends, of course, on travellers’ incomes. Information is available on average incomes by region and by year and this data has been used to adjust the monetary variables entering the model. In every case, incomes and fares are corrected for inflation to the price level of 2005. However, adjustments have also been made for changes in real income levels.

First, the level of income affects the overall level of demand; as described below, this is accounted for by estimating a coefficient that links the demand to the income (adjusted for inflation) per capita for each region and year.

Second, the impact of fare is mitigated by income; that is, regions and years with higher incomes are more willing to accept higher fares. In the modelling, this is accommodated by multiplying the fares by the factor
In some cases it has been suggested that an ‘income elasticity’ should be applied, i.e. applying a power function to the income to adjust the fares. However, more recent work (Abrantes and Wardman 2011; Daly and Fox 2012 ^{2}) has found that an elasticity of 1 is more consistent with recent data. Moreover, selected tests made on the current data also indicated that unit elasticity was most appropriate. Subsequently, when we refer to fare we mean the fare adjusted by income in the direct way indicated above.
The income measure used in these analyses was the regional Gross Value Added per capita. We believe we are using the best available income data for this modelling, although the average income for an area may not represent the income of the potential rail-using population. Income elasticity is not the main focus of this work, but is an essential point in understanding the generalised journey time (GJT) and fare elasticities which are our main focus.
Analysis results
Data chosen for analysis
Rail ticket sales data in Great Britain has, for many years, formed the basis of analysis of a wide range of impacts on rail demand. It records the number of sales between stations and, with some exceptions such as urban trips in metropolitan areas where the use of area-wide travelcards is common, it is generally felt to provide a reasonably accurate reflection of rail demand.
To conduct this analysis, a number of data sets were available to us. The data selected relates to trips outside London and not wholly within the South East, and station pairs are limited to those separated by a distance of 20–300 miles. The London and South East trips are excluded since a wide range of tickets has historically been on offer for those journeys, while other flows tend to be on a smaller range of tickets with less competition between them. Hence the use of average revenue per trip as a measure of fare involved fewer approximations. We also excluded station pairs with distances under 20 miles where rail is often not an attractive transport mode and those over 300 miles where fewer trips are observed and air competition is a relevant issue.
By excluding London trips, there is less chance of one destination dominating choice, or of rail becoming the dominant mode. These two features make it easier to accept the approximate equivalence of model form between rail modelling and general transport planning discussed above, where it was important that the choice probabilities for the alternatives should be low.
The remaining data covers 3201 station-to-station movements for the years 1990–2005, excluding 1994 which was seriously affected by widespread industrial action, and all sales other than season tickets. Sales were aggregated for each station-to-station pair and each year.^{3} Pooling data across routes and over time yields 48,015 observations, each representing a station pair and a year, for modelling purposes.
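The observation count follows directly from the panel dimensions, as this small arithmetic check shows. It assumes a balanced panel, that is, every station pair observed in every retained year; the variable names are illustrative.

```python
# Years 1990-2005 inclusive, dropping 1994 (industrial action).
years = [y for y in range(1990, 2006) if y != 1994]
n_flows = 3201

# One observation per station pair and year in a balanced panel.
n_obs = n_flows * len(years)
print(len(years), n_obs)  # 15 48015
```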
Rail ticket sales data can make only limited distinctions by journey purpose, so there might be a concern that the cost damping is in part driven by journey purpose variations by distance. Our data covers non-season tickets which, along with the distances involved, means that commuting will be a very minor proportion of trips. Although business travel with its lower price but higher time elasticities might be more prevalent for longer distance journeys, our Non-London flows are, with the exception of a very few movements between key business centres, dominated by trips for other purposes. Whilst specific other purposes might be expected to vary by distance, such as holidays and short breaks forming a larger proportion of longer trips and visiting friends or relatives, entertainment and personal business trips forming a larger proportion of shorter distance trips, we have no evidence that these trip purpose shares or, more importantly, their elasticities vary strongly with distance.
Endogeneity of explanatory variables in econometric models can lead to biased coefficient estimates. The classic railway example is that a high frequency on a route can stimulate high demand but high frequency is often a result of high demand in order to accommodate the passengers. It is the former relationship that concerns us here but without separating the two effects it might be feared that we will obtain an exaggerated effect of the impact of service frequency. We do not though regard endogeneity here to be a particular issue for a number of reasons.
The conventional wisdom in Great Britain is that endogeneity is likely to be much more of an issue in cross-sectional models, as in the example above, rather than in models as estimated here (with fixed effects controlling the cross-sectional effects), where the variation informing the parameter estimates is the time-series rather than cross-sectional dimension. As far as we are aware, not since Jones and Nichols (1983) has endogeneity been raised as an issue in either the academic or practitioner literature. They were addressing concerns about previous research, largely cross-sectional since sufficiently long time-series were then not available, and they concluded that endogeneity was not likely to arise in studies with a strong time-series element. Our view is that there are a number of compelling reasons why simultaneity bias is not a material problem in the models reported here.
Firstly, and very importantly, we are here analysing station-to-station data and it is not practical, or indeed customarily attempted (as is confirmed by the relevant industry body), for a railway company to price or offer a particular level of service between two stations that is closely based on the level of demand between those two stations. Fares and to a slightly lesser extent the service offering are largely corridor based, relating to distances travelled to avoid anomalies,^{4} on which there are very many station-to-station movements with very different demand levels. It would be hard to then claim that the price or service level offered on the flows in our regression analysis was strongly demand dependent.
Secondly, some fares in the UK have a historical basis and have been increased (roughly) in line with inflation or are constrained by the competitive situation which is outside the control of the operator. Indeed, some station-to-station movements, particularly in our data set, cover two or more operators. These features again break the endogenous link with demand. Whilst advance tickets are sold on a yield management basis, whereby greater demand leads to higher prices, this phenomenon tends to be more recent than our data set and in any event is much less prevalent on the Non-London based flows investigated here.
Finally, all journeys in Britain have at least one fare whose level is regulated by government. This clearly weakens any endogeneity.
We therefore conclude endogeneity is not a serious problem, and in particular we see no reason why it has distorted the relationships between demand elasticities and distance which, in our models, depend on fare differences rather than absolute fare levels.
Classical rail models
The basic approach is to estimate fixed-effect pooled cross-section time-series models, whereby a constant gives the basic magnitude of demand on each flow, linked to factors such as the population around the origin and destination stations and competition from other modes, with other variables explaining variations around this level of demand. Instead of the classical approach of estimating constant elasticities with respect to fare and time the models allow the fare and time elasticities to vary with journey length in ways suggested by disaggregate and conventional transport planning analyses. As described in the previous section, this is done by specifying continuous functions which allow increasing, diminishing or no effect with journey length, measured by distance, cost or journey time, on the elasticities.
The function f varies across the model types as described in Sect. 2.3. The variables appearing in this function are the following.
\(\text{fare}_{ij}^{t}\) gives the fare between the stations i and j in year t, derived as revenue per trip, in pounds sterling defined at 2005/6 price levels and adjusted by income as described in Sect. 2.4.
\(\text{GJT}_{ij}^{t}\) gives the generalised journey time, which is a measure in time units (minutes one way) of the timetable-related service quality and comprises the origin to destination station journey time, service headway and any need to change trains. The GJT measure is standard in the UK rail industry and was supplied in its combined form.
\(\text{dist}_{ij}\) represents the distance in miles from i to j.
The estimated parameters α and λ govern, respectively, the damping and the relative importance of fare and GJT.
This formulation includes dummy variables for the year of the Hatfield accident (2000) and for each of the next 2 years to represent the disruption to services due to widespread speed restrictions and engineering works on the rail network. The impact of this work was not uniform across the network and it is not to be expected that an impact will be measured significantly, particularly in the simple way we have used, in all the models. Parameters β are estimated for this effect and, more importantly, for log income. The inclusion of these parameters is essential to avoid biasing the estimates of the key time and cost parameters of interest.
Parameters including 3201 values for \(\text{constant}_{ij}\) are estimated by nonlinear least squares. The code is written in GAUSS (Aptech Systems, Inc. 2015) with some modifications from a code by Hill and Adkins (2001) to handle the large number of fixed-effect constants. The least-squares approach is equivalent to a maximum likelihood estimation, providing that the error terms are normally distributed, an assumption that is supported by the use of log trips as the left-side variable.
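One common way to handle a large number of fixed-effect constants is the within transformation: subtracting flow means from every variable removes the constants before the remaining parameters are estimated. The sketch below illustrates this for the constant-elasticity form (Model IVA), which is linear in its parameters. It is an assumption made for illustration and not a description of the GAUSS code used in the paper; the damped models additionally require a nonlinear search over α. All data are synthetic and all names are hypothetical.

```python
import math
from collections import defaultdict

def demean_within(values, groups):
    """Subtract each group's mean, which sweeps out one constant per group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def fit_constant_elasticity(log_trips, log_fare, log_gjt, flows):
    """Least-squares (lam1, lam2) for Model IVA with one constant per flow."""
    y = demean_within(log_trips, flows)
    x1 = demean_within(log_fare, flows)
    x2 = demean_within(log_gjt, flows)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12  # 2x2 normal equations solved by Cramer's rule
    lam1 = (s1y * s22 - s2y * s12) / det
    lam2 = (s2y * s11 - s1y * s12) / det
    return lam1, lam2

# Synthetic, noise-free panel (all numbers hypothetical): 3 flows x 4 years.
true_lam1, true_lam2 = -0.5, -1.2
flows, log_trips, log_fare, log_gjt = [], [], [], []
panel = [(5.0, 10.0, 60.0), (7.0, 25.0, 150.0), (6.0, 40.0, 240.0)]
for k, (const, fare0, gjt0) in enumerate(panel):
    for year in range(4):
        lf = math.log(fare0 * (1 + 0.05 * year * (k + 1)))
        lg = math.log(gjt0 * (1 - 0.01 * year * year))
        flows.append(k)
        log_fare.append(lf)
        log_gjt.append(lg)
        log_trips.append(const + true_lam1 * lf + true_lam2 * lg)

lam1_hat, lam2_hat = fit_constant_elasticity(log_trips, log_fare, log_gjt, flows)
```

Because the synthetic data contain no noise, the estimates recover the generating elasticities up to rounding error, whatever the values of the flow constants.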
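Table 4 Fixed elasticity model (model IVA)

| Variable | Estimate | t value |
| --- | --- | --- |
| \(\lambda_{1}\) | −0.4875 | −53.56 |
| \(\lambda_{2}\) | −1.2193 | −69.79 |
| \(\beta_{1}\) (log Income) | 0.5292 | 54.18 |
| \(\beta_{2}\) (Hat2000) | −0.0357 | −10.25 |
| \(\beta_{3}\) (Hat2001) | −0.0008 | −0.22 |
| \(\beta_{4}\) (Hat2002) | 0.0106 | 2.95 |
| Observations | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.034676 | |
| Log likelihood | 14,235.26 | |
| AIC | −22,054.5 | |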
The result indicates that the data set provides a robust basis for the investigation that is here to be undertaken. Not only are the key parameters estimated with an extremely high level of precision, the elasticities are generally plausible and align reasonably well with the figures contained in the Passenger Demand Forecasting Handbook (PDFH) that contains the elasticities recommended for use in the railway industry in Great Britain (ATOC 2013). The GJT elasticity is very similar to PDFH’s value of −1.2, although the PDFH figure is explicitly long run. The recommended PDFH income elasticity is 1.4 between major cities and 0.65 for other flows. The income elasticity estimated here is smaller than the values that PDFH recommends, although not greatly so given the other flows dominate our data. The fare elasticity is lower than PDFH’s −1.2, but we should note that PDFH is intended to give long run fare elasticities and the static model of Table 4 will understate the long run effect.

the mean squared error \(\hat{\sigma}^{2} = SSE/\left( R - h \right)\), where R is the number of observations and h is the number of estimated parameters;

the log likelihood is calculated by (Spiess and Neumeyer, 2010)^{5} \(\log (L) = 0.5\left( -R\left( \log 2\pi + 1 - \log R + \log SSE \right) \right)\);

the Akaike Information Criterion (AIC) is calculated by $${\text{AIC}} = 2\left( {h + 1} \right) - 2\log \left( L \right)$$
The positive values for log likelihood are unusual, but arise from the calculation based on the assumption that the error term is normally distributed. All three of these statistics are measures of model quality and we can see from the formulae that for a good model the mean squared error and AIC should be small (less positive and more negative respectively), while the log likelihood should be large (more positive).
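As a consistency check, the three statistics can be recomputed from one another. The sketch below recovers the log likelihood and AIC of Table 4 from the reported mean squared error. It assumes that h counts the 3201 flow constants plus the six reported coefficients; this count is an inference that happens to reproduce the published AIC, not something stated explicitly in the text. Small discrepancies arise because the reported mean squared error is rounded.

```python
import math

def fit_statistics(sigma2, n_obs, n_params):
    """Log likelihood and AIC implied by a reported mean squared error."""
    sse = sigma2 * (n_obs - n_params)  # invert sigma2 = SSE / (R - h)
    log_l = -0.5 * n_obs * (
        math.log(2 * math.pi) + 1 - math.log(n_obs) + math.log(sse)
    )
    aic = 2 * (n_params + 1) - 2 * log_l
    return log_l, aic

# Table 4 (model IVA): assumed h = 3201 flow constants + 6 coefficients.
log_l, aic = fit_statistics(sigma2=0.034676, n_obs=48015, n_params=3201 + 6)
print(round(log_l, 1), round(aic, 1))
```

The log likelihood is positive here because the residual variance is well below 1, so the normal density evaluated at the residuals exceeds 1.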
Undamped model
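Table 5 Undamped model (model 0)

| Variable | Estimate | t value |
| --- | --- | --- |
| \(\lambda_{1}\) | −0.0045 | −12.48 |
| \(\lambda_{2}\) | −0.0034 | −44.81 |
| \(\beta_{1}\) (log Income) | 0.8832 | 98.94 |
| \(\beta_{2}\) (Hat2000) | −0.0206 | −5.64 |
| \(\beta_{3}\) (Hat2001) | 0.0113 | 3.05 |
| \(\beta_{4}\) (Hat2002) | 0.0164 | 4.37 |
| Observations | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.038297 | |
| Log likelihood | 11,850.77 | |
| AIC | −17,285.5 | |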
Comparing the models of Tables 4 and 5, it is clear that the undamped model gives a substantially worse explanation of the data than the classical rail model and can be rejected. That is, the simplest transfer of a linear model is much worse for rail ticket data than the models used by rail data analysts.
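Distance-damped models
Table 6 Distance-damped models

| Variable | Type I estimate | Type I t value | Type II estimate | Type II t value |
| --- | --- | --- | --- | --- |
| \(\lambda_{1}\) | −4.7632 | −14.73 | −35.70 | −12.52 |
| \(\lambda_{2}\) | −0.7292 | −17.12 | −0.1132 | −9.69 |
| \(\alpha\) | 1.1111 | 79.22 | n.a. | n.a. |
| \(\alpha_{1}\) | n.a. | n.a. | 1.5424 | 79.33 |
| \(\alpha_{2}\) | n.a. | n.a. | 0.6690 | 29.86 |
| \(\beta_{1}\) (log Income) | 0.6835 | 73.00 | 0.6299 | 66.68 |
| \(\beta_{2}\) (Hat2000) | −0.0349 | −9.83 | −0.0322 | −9.17 |
| \(\beta_{3}\) (Hat2001) | −0.0027 | −0.74 | 0.0015 | 0.42 |
| \(\beta_{4}\) (Hat2002) | 0.0046 | 1.25 | 0.0094 | 2.61 |
| Observations | 48,015 | | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.035993 | | 0.035275 | |
| Log likelihood | 13,340.93 | | 13,825.40 | |
| AIC | −20,263.9 | | −21,230.8 | |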
The fact that two of the α values are significantly outside the acceptable range (0, 1) to obtain cost damping and pass the kilometrage test may indicate that these formulations are simply not valid. However, the indication is that the α values are strongly positive and that (in Type II) fare damping is stronger than time damping, so that the value of time increases with trip length, as expected.
In addition to the unacceptable α values, the fit offered by these models is worse than the standard rail model, so these models do not offer any useable advance.
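Since log likelihoods are reported so that nested models can be compared with χ² tests, the comparison of the two distance-damped models can be sketched as follows. This is an illustration under a stated assumption, namely that Type I is the special case of Type II in which a single α applies to both components (one restriction, hence one degree of freedom); the function name is hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_restricted, loglik_general):
    """Likelihood-ratio statistic and p-value for one restriction.

    For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    lr = 2.0 * (loglik_general - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Log likelihoods taken from the distance-damped models table.
# ASSUMPTION: Type I is nested in Type II with one restriction.
lr_stat, p_value = likelihood_ratio_test_1df(13340.93, 13825.40)
print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.3g}")
```

Under that assumption the statistic far exceeds the 5% critical value of 3.84 for one degree of freedom, so the restriction to a single α would be rejected.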
‘Dynamic’ functional forms
The final group of models estimated were those in which the damping operates as a function either of the generalised cost or of its components. Thus the damping is an intrinsic, ‘dynamic’ part of the model, rather than an additional component as in distance damping.
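Dynamically-damped models (entire function)

| Variable | Type III^{a}: Estimate | Type III^{a}: t value | Type IIIA: Estimate | Type IIIA: t value |
|---|---|---|---|---|
| \(\lambda_{0}\) | −9.3867 | −74.66 | −1.5352 | −76.11 |
| \(\lambda_{2}\) | 24.5461 | 35.22 | 0.2231 | 36.66 |
| \(\alpha\) | 0.08 | Fixed | n.a. | |
| \(\beta_{1}\) (log Income) | 0.6364 | 67.08 | 0.6216 | 65.52 |
| \(\beta_{2}\) (Hat2000) | −0.0327 | −9.26 | −0.0337 | −9.57 |
| \(\beta_{3}\) (Hat2001) | 0.0024 | 0.68 | 0.0011 | 0.32 |
| \(\beta_{4}\) (Hat2002) | 0.0124 | 3.41 | 0.0112 | 3.08 |
| Observations | 48,015 | | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.035690 | | 0.035508 | |
| Log likelihood | 13,543.67 | | 13,666.50 | |
| AIC | −20,671.3 | | −20,917.0 | |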
These results indicate that the log function of IIIA outperforms the power functions of Type III. In fact, the log function is the limiting case of the Box–Cox when α → 0, so that the result and the difficulty in estimating positive values of α are not entirely surprising.
It can be seen that the model IIIA gives a fit to the data between those of Types I and II but not as good as that of the classical rail model reported in Sect. 3.2, which is very similar in form.
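The limiting argument can be checked numerically. The sketch below uses the standard Box–Cox transformation \((x^{\alpha} - 1)/\alpha\); the exact scaling of the power function used in Type III is not given in this section, so this is an illustration of the limit rather than a reproduction of the estimated model. The value x = 60 is an arbitrary illustrative input.

```python
import math


def box_cox(x, alpha):
    """Standard Box-Cox transformation; equals log(x) in the limit alpha -> 0."""
    if alpha == 0:
        return math.log(x)
    return (x ** alpha - 1.0) / alpha


x = 60.0  # illustrative value only, e.g. a journey time in minutes
for alpha in (1.0, 0.5, 0.08, 0.03, 0.001):
    gap = box_cox(x, alpha) - math.log(x)
    print(f"alpha = {alpha:<6} Box-Cox = {box_cox(x, alpha):8.4f}  gap to log = {gap:8.4f}")
```

As α falls towards zero the gap to the log function shrinks, which is consistent with exponents as small as those fixed in the tables behaving almost like a log transformation.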
Dynamically damped models (power functions of components)

| Variable | Type IV: Estimate | Type IV: t value | Type IV fixed α: Estimate | Type IV fixed α: t value |
|---|---|---|---|---|
| \(\lambda_{1}\) | −1.0 | Fixed | −14.6403 | −52.47 |
| \(\lambda_{2}\) | −1.0 | Fixed | −34.7155 | −69.51 |
| \(\alpha_{1}\) | 0.2006 | 64.29 | 0.03 | Fixed |
| \(\alpha_{2}\) | 0.2705 | 155.84 | 0.03 | Fixed |
| \(\beta_{1}\) (log Income) | 0.6340 | 66.10 | 0.5360 | 54.83 |
| \(\beta_{2}\) (Hat2000) | −0.0313 | −8.87 | −0.0355 | −10.18 |
| \(\beta_{3}\) (Hat2001) | 0.0037 | 1.03 | −0.0004 | −0.12 |
| \(\beta_{4}\) (Hat2002) | 0.0137 | 3.78 | 0.0109 | 3.04 |
| Observations | 48,015 | | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.035563 | | 0.034773 | |
| Log likelihood | 13,629.10 | | 14,168.40 | |
| AIC | −20,842.2 | | −21,920.8 | |
The final model of this family that was tested was Model IVB, which represents a different approach to transforming time and cost.
Dynamically damped models (mixed log and linear functions)

| Variable | Type IVB: Estimate | Type IVB: t value |
|---|---|---|
| \(\lambda_{1}\) | −0.1073 | −63.55 |
| \(\lambda_{2}\) | −0.0289 | −47.85 |
| \(\alpha_{1}\) | −0.1801 | −44.69 |
| \(\alpha_{2}\) | −0.0216 | −5.03 |
| \(\beta_{1}\) (log Income) | 0.5109 | 52.74 |
| \(\beta_{2}\) (Hat2000) | −0.0331 | −9.63 |
| \(\beta_{3}\) (Hat2001) | −0.0012 | −0.35 |
| \(\beta_{4}\) (Hat2002) | −0.0091 | 2.57 |
| Observations | 48,015 | |
| \(\hat{\sigma}^{2}\) | 0.033655 | |
| Log likelihood | 14,954.25 | |
| AIC | −23,488.5 | |
Here we obtain a fit to the data which is substantially better than that of the classical rail models. That is, the cost damping given by the log function, in principle the strongest damping that can be applied in a smooth monotonic function, appears inadequate. However, the model presents an issue because of the sign of both \(\alpha_{1}\) and \(\alpha_{2}\): the negative estimates imply that for sufficiently large values of GJT or cost the slope of the function will be incorrect, i.e. that increases in time or cost would imply an increase in demand.
Discussion and conclusions
The objective of this study was to investigate whether insights and functional forms derived from conventional (road based) transport planning could be applied to the analysis of rail ticket sales data. This data has typically been analysed using constant-elasticity models which, viewed from the standpoint of conventional transport planning, imply an extreme form of cost damping, i.e. that sensitivity to cost (and time) decreases as journey length increases. Cost damping has been observed in conventional planning work, but perhaps not to the same extent as in rail data analysis.
Initial investigation suggested that it was possible, at least as an approximation, to interpret the functional forms of conventional transport planning for use in rail data analysis. Functions were therefore set up that mirrored the four forms of cost damping identified in the work by Daly (2010), as well as a model without cost damping. It turned out that one of these four forms reproduced the fixed-elasticity model of classical rail data analysis.
The first result of the analysis is that the existence of cost damping is strongly confirmed by this data. Very low values of exponents are found, close to zero, so that log functions are suggested. The second result is that damping of cost is stronger than damping of time, so that the value of time increases with trip length. Both of these results confirm findings from other models. They strengthen the case for including damping, if possible differentially by time and cost, in travel demand models generally. Omission of this effect could cause errors in forecasting, depending on the nature of the forecast scenarios being considered.
Estimating income elasticity and the impact of the Hatfield accident were not among the objectives of the study, but the values that were obtained in the better models were plausible.
Output of proposed model forms

| Type | Fit (log likelihood) | Range issues | Cost elasticity (at mean) | Time elasticity (at mean) | Value of time (p/minute, at mean) |
|---|---|---|---|---|---|
| 0 | 11,851 | No | −0.09 | −0.82 | 75.6 |
| I | 13,341 | \(\alpha > 1\) | −0.39 | −0.76 | 15.3 |
| II | 13,825 | \(\alpha_{1} > 1\) | −0.35 | −1.03 | 23.0 |
| III | 13,544 | \(\alpha\) fixed | −0.37 | −1.16 | 24.5 |
| IIIA | 13,667 | No | −0.40 | −1.13 | 22.3 |
| IV | 14,168 | \(\alpha\) fixed | −0.36 | −1.19 | 26.0 |
| IVA | 14,235 | No | −0.49 | −1.22 | 19.7 |
| IVB | 14,954 | \(\alpha < 0\) | −0.52 | −1.19 | 17.9 |
Here we see that models of Type IV, with component-specific ‘dynamic’ damping, fit the data better than the other models. Of the other models, Model 0 fits very poorly and has unacceptable cost elasticity and value of time results. Models I and II fail the kilometrage test. Model III requires α to be fixed, but both it and Model IIIA are rejected in favour of the models of Type IV primarily because of their inferior fit.
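The Model 0 entries in the table can be traced back to its coefficients with a short calculation. The sketch below assumes that the undamped model is linear in fare (in pounds) and GJT (in minutes) within the log of demand, so that each elasticity is the coefficient multiplied by the level of the variable, and the value of time is the ratio of the two coefficients. The mean fare and mean GJT used here are not given in this section; they are back-calculated from the rounded elasticities and are therefore approximate assumptions.

```python
# Coefficients of the undamped model (Model 0), from its results table.
LAMBDA_FARE = -0.0045  # per pound (unit is an assumption)
LAMBDA_GJT = -0.0034   # per minute (unit is an assumption)

# ASSUMPTION: sample means back-calculated from the rounded elasticities.
MEAN_FARE = 20.0   # pounds
MEAN_GJT = 241.0   # minutes


def value_of_time_pence_per_min(lambda_time, lambda_fare):
    """Ratio of time to cost coefficients, converted from pounds to pence."""
    return 100.0 * lambda_time / lambda_fare


def semi_log_elasticity(coef, level):
    """Elasticity of demand when log(demand) is linear in the variable."""
    return coef * level


vot = value_of_time_pence_per_min(LAMBDA_GJT, LAMBDA_FARE)
print(f"Value of time: {vot:.1f} p/minute")
print(f"Fare elasticity at mean: {semi_log_elasticity(LAMBDA_FARE, MEAN_FARE):.2f}")
print(f"GJT elasticity at mean: {semi_log_elasticity(LAMBDA_GJT, MEAN_GJT):.2f}")

# Without damping the fare elasticity grows in proportion to the fare.
for fare in (10.0, 50.0, 100.0):
    print(f"fare = {fare:5.0f}  elasticity = {semi_log_elasticity(LAMBDA_FARE, fare):.3f}")
```

Under these assumptions the calculation returns the tabulated 75.6 p/minute, and the final loop shows the property that damping is meant to counteract: a constant cost coefficient makes the fare elasticity rise in proportion to the fare.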
Within the models of Type IV, the base model IV is less good, because it requires us to fix α, it fits the data less well and the implied value of time is a little high. There is little to choose between models IVA and IVB in terms of the elasticity or implied value of time. Following the discussion of Sect. 3.2, the time elasticity value appears reasonable, comparing it with the PDFH equivalent; while these models are less elastic to cost than PDFH recommends, they are well within the broad range suggested by the UK Department for Transport (2015), which is −0.9 to −0.2. The value of time is also slightly higher than we might expect, suggesting that the effect of fares is not fully captured in these models. For this reason, and because travellers can switch between ticket types to mitigate the effect of fare increases, we do not recommend analysis of this type for the estimation of values of time.
In summary, this work has shown that it is possible to apply the functional forms used in conventional transport demand analysis to rail ticket sales data. When this is done, a high degree of cost damping is found, stronger than given by a log function. Further analyses should look at the methods available for representing this extreme form of damping.
Footnotes
1. Generalised ‘cost’ is conventionally quantified in time units.
2. These studies consider the impact of varying income on the value of time, which effectively converts cost variables to a time scale for use in modelling. Then an income elasticity of 1 for the value of time implies an income elasticity of the sensitivity of behaviour to cost also equal to 1.
3. Return tickets were included both for outbound and for return legs, but attributed to the origin station of the outbound leg as this links better to regional income.
4. In some further analysis we found a strong dependence of fares on distance and years, but little or no relationship to demand.
5. We report log likelihood rather than the sum of squares of errors because this permits the use of \(\chi^{2}\) tests for the significance of the difference of nested models.
Notes
Acknowledgments
Nobuhiro Sanko acknowledges support by JSPS KAKENHI Grant Number 24330132. An earlier version of this paper was presented at the hEART Conference, Leeds in September 2014. Comments by anonymous reviewers and the editor have helped us improve the paper, but we remain responsible for interpretations and any errors.
References
Abrantes, P.A.L., Wardman, M.: Meta-analysis of UK values of time: an update. Transp. Res. A 45(1), 1–17 (2011)
Aptech Systems, Inc.: GAUSS. http://www.aptech.com/ (2015). Accessed 13 Nov 2015
Association of Train Operating Companies (ATOC): Passenger Demand Forecasting Handbook, Version 5.1. ATOC, London (2013)
Daly, A.: Cost Damping in Travel Demand Models, Report for Department for Transport. http://webarchive.nationalarchives.gov.uk/20110202223908/, http://www.dft.gov.uk/pgr/economics/rdg/costdamping/ (2010)
Daly, A., Carrasco, J.: The influence of trip length on marginal time and money values. In: Kitamura, R. et al. (eds.) The Expanding Sphere of Travel Behaviour Research: Selected Papers from the Proceedings of the 11th Conference on Travel Behaviour Research. Emerald Books (2009)
Daly, A., Fox, J.: Forecasting mode and destination choice responses to income change. In: Presented at IATBR, Toronto (2012)
Department for Transport: TAG Unit M2, Section 3.3 on Cost Damping. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/275597/webtagtagunitm2variabledemandmodelling.pdf#nameddest=chptr03 (2014)
Department for Transport: TAG Unit M2, Section 6.4 on Realism Testing. https://www.gov.uk/government/publications/webtagtagunitm2variabledemandmodelling (2015)
Fox, J., Daly, A., Gunn, H.: Review of RAND Europe’s Transport Demand Model Systems, RAND Europe Report MR 1694. http://www.rand.org/pubs/monograph_reports/MR1694.html (2003)
Hill, C., Adkins, L.: Using Gauss for Econometrics. http://pages.suddenlink.net/ladkins/pdf/GAUSS.pdf (2001). Accessed 4 Aug 2013
Jones, I.S., Nichols, A.J.: The demand for intercity rail travel in the United Kingdom: some evidence. J. Transp. Econ. Policy 17(2), 133–153 (1983)
Rich, J., Mabit, S.L.: Box–Cox approximations and beyond. In: Presented to hEART Conference, Stockholm (2013)
Spiess, A.N., Neumeyer, N.: An evaluation of R^{2} as an inadequate measure for nonlinear models in pharmacological and biochemical research: a Monte Carlo approach. BMC Pharmacol. (2010). doi: 10.1186/1471-2210-10-6
Wardman, M.: Price elasticities of surface travel demand: a meta-analysis of UK evidence. J. Transp. Econ. Policy 48(3), 367–384 (2014)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.