Participation in online activities while travelling: an application of the MDCEV model in the context of rail travel

Travel-based multitasking, i.e. using travel time to conduct enjoyable and/or productive activities, is the subject of an increasing number of theoretical and empirical studies. Most existing studies focus on modelling the choice of which activities people conduct while travelling, and a limited number of papers also focus on their duration. The novelty of this study with respect to this literature is two-fold. Firstly, we specifically study the engagement in different online activities while travelling, and apply the state-of-the-art Multiple Discrete-Continuous Extreme Value (MDCEV) model to jointly model the choice and duration of multiple activities. We apply this model to data collected face-to-face from train passengers in the UK. We find that activity choice and duration are explained by both passenger and trip characteristics, especially trip purpose, ticket type and day/time of the trip. Secondly, we show how such modelling can assist in investment appraisal, in particular by providing insights into lower- and upper-bound estimates of the proportion of the entire travel time spent working, itself of importance in, for example, valuation of business travel time using the so-called Hensher Equation. We present a detailed discussion of how the findings from our work contribute to the broader discourse around the nature of travel time and its valuation.


Introduction
The perception of travel time has been undergoing a revolution: away from that of an onerous and wasted period and towards that of an opportunity to undertake enjoyable and/or productive (in an economic sense) activities. This phenomenon, variously referred to as travel-based multitasking or travel time use, has been the subject of a number of theoretical and empirical studies. These have definitively made obsolete the long-standing assumption of 'unusable and wasted' travel time, originally and implicitly motivated by the perspective of a private vehicle driver whose entire attention is focused on driving. This assumption has nevertheless underpinned most transport models in use today, which at best capture travel-based multitasking implicitly, e.g. as sensitivity to travel duration. This implicit treatment is, however, becoming increasingly inadequate to assist in guiding policies such as those concerning connected and autonomous vehicles (CAV), investments in connectivity for travel modes (including 5G connectivity) or design of rolling stock and airplane facilities to support mobile ICT use, including tables, power sockets or on-board meeting spaces (compartments).
More fundamentally, a number of recent studies have explicitly linked the ability to conduct activities while travelling to changes in the value of travel time (Wardman et al. 2019; Molin et al. 2020). Such evidence helps build a case for the inclusion of such considerations in frameworks relying on the concepts of value of travel time (VTT) and value of (business) travel time savings, or V(B)TTS, including mode choice models or investment appraisal.
The explicit modelling of travel time use has emerged only in the past 15 years, as shown in recent systematic reviews on the topic (Keseru and Macharis 2018; Pawlak 2020). A notable exception was the work by Hensher (1977) in the context of business air travellers. Through its subsequent extension and formalisation (Fowkes et al. 1986; Batley 2015), it led to the emergence of an approach to the valuation of travel time savings, the so-called Hensher Equation (HE), that does account for the use of travel time for work-related activities. Despite its apparent attractiveness, it has been employed in only a limited number of cases and typically in a restricted form due to the difficulty of fully populating the model with data (Wardman et al. 2019). In addition, the HE is focussed upon business travellers and their productivity, neglecting the typically larger segment of non-business passengers.
The resurgence of interest in the productive use of travel time has coincided with the proliferation of mobile ICT (Lyons and Urry 2005), as shown in the recent systematic reviews (Keseru and Macharis 2018;Pawlak 2020). This is in spite of the fact that travellers (not necessarily private vehicle drivers) have always spent their journeys not being idle. It has been argued, however, that mobile ICT have been particularly disruptive through enabling the flexibility to participate in a variety of digital activities while travelling. However, digital (or ICT-enabled) activities increasingly rely not only on access to connectivity, but also on sufficient bandwidth, larger amounts of data as well as reliability, to deliver an enjoyable and secure online experience. A simple illustration of this trend is the growth in data consumption per user, which has been consistently reported across countries (Fig. 1).
Despite the proliferation of online activities in the course of travel, studies modelling this phenomenon in detail (especially when it comes to the analysis of activity duration) have been comparatively rare. As shown in the next section, no study has attempted to model travel time allocation between online activities jointly with the use of the MDCEV approach. Yet this approach appears promising for two reasons. Firstly, it has the ability to infer determinants of time allocation, both in terms of activity choice and duration. Secondly, it also provides a means of forecasting policy scenarios, thereby generating inputs for appraisal frameworks.

Objectives of the present paper
The aim of the present paper is to extend the understanding of the determinants of the choice of, and allocation of time to, online activities, with a specific focus on the choice between online work and other activities conducted online. We do this by applying the MDCEV model in the context of rail travel. The outcomes of the study concern the quantification of the effects of journey and respondent attributes on (i) the propensity to participate in online work and non-work activities (discrete decision); and (ii) the amount of travel time allocated to each of these activities (continuous decision). In addition, the paper presents an application of the MDCEV framework to forecasting time use while travelling, given the journey and respondent attributes. Such an application is shown to be of use in the context of transport investment appraisal methodologies, especially in relation to the Hensher Equation.

Outline of the paper
This paper is structured in six sections. Section "Literature review" presents a brief review of the existing literature on travel-based multitasking, especially in relation to the value of travel time. Section "Data" describes the data collection process and the main features of the sample used in this paper. Section "Modelling framework" introduces the MDCEV modelling framework and its application in the present context, including its estimation principles and application to forecasting. Section "Findings" presents the findings and Section "Conclusions" discusses the insights gained within the broader discourse concerning the nature of travel time.

Literature review
Lavieri et al. (2018) classified multitasking, including travel-based, among the six main impacts of virtual activities on activities and travel, extending the classical typology of the relationships between ICT (earlier telecommunications) and travel (Salomon 1986; Mokhtarian 1990). A more detailed theoretical discussion concerning multitasking and its relationship to ICT and travel is available from Circella et al. (2012) and Kenyon and Lyons (2007), while below we provide a brief account of the various discourses towards which this study contributes.
In terms of the explicit modelling of travel time allocation, two recent theoretical contributions came from Pawlak et al. (2017) and Pudāne et al. (2018). The former presents an explicit treatment of an arbitrary number of activities while travelling. Drawing upon the earlier microeconomic models by Pawlak et al. (2015) and Winston (1986), the authors presented a hazard-based model which considers multiple periods of interweaving activities, and elicits overall time allocations for the various activities. In their empirical example, the authors showed an application of the framework to modelling work and non-work activity as well as productivity while travelling by train. Their model is, however, rather demanding in terms of data, requiring information on the duration and ordering of the periods of particular activities, making it potentially more difficult to operationalise than direct time allocation models. Pudāne et al. (2018) proposed a model for the context of (fully) automated vehicles which would allow passengers to engage in activities other than driving. The authors provided a number of contexts in which full vehicle automation may result in changes to daily time use, including location of the activities, with the potential to re-time some activities such that they could be undertaken in transit. However, the authors did not propose any explicit econometric operationalisation of their framework. The range of possible scenarios involving travel-based multi-tasking justifies the further development of tools to model travel time use, a research objective that the present paper seeks to contribute toward. At the same time, empirical studies concerning time use while travelling have tended to concentrate on modelling the discrete decisions concerning activity choices (e.g. when to depart, to what destination, by what mode, by which route). Only a handful of studies have attempted to model the choice and duration of activities while travelling. 
These have employed techniques such as cluster analysis (Timmermans and Van der Waerden, 2008), skewed logit (Zhang and Timmermans 2010), ordered logit (Wang and Loo, 2018), latent class binary logit (Shamshuripour et al., 2020), panel effects regression (Rasouli and Timmermans 2014), log-linear models (Pawlak et al. 2016) and discrete-choice and hazard-based models linked with a copula (Pawlak et al. 2017). Most recently, Varghese and Jana (2019) applied the Multiple Discrete-Continuous Extreme Value (MDCEV) model, though their study did not explicitly distinguish between activities undertaken online and offline. Pawlak et al. (2020) made use of the MDCEV utility function specification to formulate and estimate a time use and goods consumption model that explicitly reflects energy use and takes into account travel-based multi-tasking. Rasouli and Timmermans (2014) used a panel effects regression model to look at the duration of working using the internet, but did not consider the non-work online counterpart. To the best of our knowledge, however, no study has made use of the MDCEV approach to model travel time allocation between online activities. Systematic reviews of such studies are available from Keseru and Macharis (2018) and Pawlak (2020). As for the link between travel-based multitasking and value of travel time, it is worth observing that a number of studies such as Malokin et al. (2019) have postulated that the activities undertaken while travelling can have an impact on mode choice decisions. This coincides with the findings of Wardman et al. (2019) and Molin et al. (2020) in relation to the downrating of the value of travel time (VoT) savings (as the value of time spent travelling increases), especially in the digital age, in which mobile ICTs decouple activities from their previous particular spatial and temporal contexts, i.e. association with specific places or times.
While the prior research by Ettema and Verschuren (2007) reports an instance in which multitasking individuals are characterised by higher VoT, it appears to be related to strong task-orientation and possibly higher underlying VoT. Overall, this implies that advancing analyses of travel-based multitasking, including those concerning the role of online activities, may yield insights into a set of drivers of travel satisfaction and productivity. Those indicators may in turn affect modal choices, themselves among the most important behaviours in transport planning and policy-making (Pawlak 2020). In particular, policies seeking to facilitate the use of travel time in the context of particular modes may make such modes more competitive against others. The present study paves the way towards a more explicit understanding and measurement of the factors that shape travel time use, especially those concerning digital (online) activities, whose popularity has been observed to have grown systematically over the past decade (cf. Lyons et al., 2016). Whilst past studies have focused strongly on work-related activities and the notion of productivity, it is worth observing that the growth in data consumption has been driven much more substantially by non-work activities, such as video and media streaming (ITU 2015). In fact, this trend is expected to continue, fuelled by the growing popularity of on-demand video and media services (Netflix, Amazon Prime, Facebook Watch, Apple TV, Spotify, Tindall, etc.) and the ever-increasing quality of those media, such as High Definition (HD) or 4K video. These trends point towards a need to better understand and forecast online activity during travel, acknowledging that high-speed and reliable connectivity tends to be more challenging to provide on the move than at fixed locations (Pawlak 2020).
Secondly, the present study has specific relevance for the estimation of travel time savings. In particular, the ability to understand how much time spent online is devoted to online work and non-work activities can provide lower- and upper-bound estimates of the proportion of the entire travel time spent working. This parameter, as we will describe in the Findings section of this paper, plays an important role in the so-called Hensher Equation (Hensher 1977; Batley 2015; Wardman and Lyons 2016), an approach to valuing business travel time savings that explicitly accounts for travel time use and productivity. Wardman and Lyons (2016) provide a summary of empirical studies that provided estimates, based on survey data, of the relevant parameters of the Hensher Equation. However, we observe that data concerning time spent online (but not necessarily what it was spent on, to avoid privacy concerns) can be relatively easily collected from the relevant IT infrastructures. It follows that the proposed modelling approach offers a step towards a more representative (and possibly more updateable) approach to estimating one of the key parameters driving the value of travel time savings.
Lastly, the ability to develop forecasts concerning travel time use can supplement existing travel information attributes, and potentially take account of personalised requirements. Under such circumstances, a user could make travel-related decisions based not only upon the traditional attributes of travel alternatives (trip duration, cost), but also the expected profile of time use and productivity while travelling. In this context, the study can be seen as contributing to the body of research that links travel-based multitasking and subjective well-being, e.g. Ettema et al. (2012) and a recent discussion and overview by Mokhtarian (2019), enabling travellers to make conscious and more satisfactory travel decisions. This would arguably complement the notion of mobility 'servicisation', i.e. the shift towards meeting mobility needs through services as opposed to ownership of modes, which to date has tended to focus on interoperability between the modes, with the issue of travel time use and productivity being largely neglected.

Survey protocol
Project SWIFT (Superfast Wi-Fi In-carriage for Future Travel) was an industrial research project to implement and trial an alternative to mobile network connectivity in trains, using a dedicated trackside wireless infrastructure connected to trackside optic fibre (NIC, 2016). Funded jointly by Innovate UK (the UK's Innovation Agency), the UK Rail Safety and Standards Board (RSSB), Cisco Systems and Abellio ScotRail between 2016 and 2018, the project sought to trial the technology on a route from Glasgow Queen Street to Edinburgh Waverley via Falkirk, in Scotland. The aim was to demonstrate the feasibility of achieving consistent and reliable backhaul of initially 100 Mbps, later increased to 300 Mbps, to a train moving at speeds of 100 mph. This would effectively translate to increasing the backhaul by an order of magnitude, as compared to the typical 3G/4G-based backhaul.
In order to better understand the implications for passenger use of travel time, especially in relation to digital online activities, and the consequences for satisfaction, productivity and perception of travel, the project originally intended to conduct pre- (baseline) and post-implementation surveys. However, due to changes in the project scope, such that improved connectivity was implemented only on selected sections of the route and made available to passengers only on suitably equipped trains, only the baseline survey was in practice conducted. Initially, a single wave of the baseline survey was planned, which took place in late October and early November 2017. Due to the introduction of new rolling stock on the route, however, it was deemed necessary to repeat the survey to account for possible systematic effects on passenger satisfaction of the new rolling stock as distinct from enhanced mobile connectivity. The second wave was thus conducted in September 2018. The second wave also permitted revision of the screening and background questions, reducing the number of responses that would otherwise have had to be removed due to inconsistencies.
The questionnaire included 20 questions covering various aspects of passenger experience, a subset of which is analysed here:
• Journey context: origin, destination, time of day, frequency of travel, ticket type;
• Internet use: access devices, types of online activities/services engaged in while travelling: online banking; cloud-based storage; e-mail; journey planning and maps; messaging; news; online calling; shopping; social media, dating and blogs; video or music streaming; games; Virtual Private Network (VPN) and an open category ('other') in which the respondent could state further activities not captured under the other activity types;
• Connectivity: access mode (on-board Wi-Fi, mobile network), satisfaction with the connection speed and reliability (very satisfied/fairly satisfied/neither satisfied nor dissatisfied/fairly dissatisfied/very dissatisfied);
• Time use: expected and actual allocation of time to the internet use for work, for leisure, and non-internet activities;
• Productivity as compared to typical office conditions;
• Passenger profile: gender, age, work status, companion;
• Willingness to undertake the same journey in the future using ScotRail.
An attraction of the Glasgow-Falkirk-Edinburgh corridor as a case study is that it tends to operate within capacity, i.e. does not get overcrowded under normal (non-disrupted) conditions. Hence, this corridor should reflect typical passenger behaviour free of any confounding effects that crowding might have on the productive use of travel time.
The survey was administered by fieldworkers, who travelled on the route and approached passengers to invite their participation in the study. The individuals were screened in terms of whether they had already used, or would use, connectivity during their trip. Only passengers older than 16 years of age were invited to take part. If passengers agreed to participate, they could either fill in the questionnaire on the spot and return to the fieldworker, complete the questionnaire and return by post, or use the web versions of the questionnaire (in which case they were given a web link). The surveys were conducted on both weekdays and weekends, during morning, afternoon and early evening hours. The final sample sizes were 338 and 717 for the 2017 and 2018 waves respectively.
In what follows, the paper will focus on the allocation of travel time between three generic activities, namely:
• using the internet for work purposes ('work online'),
• using the internet for leisure purposes ('leisure online'),
• not using the internet ('offline').
Unfortunately, the survey does not offer a breakdown of the offline activities between work and non-work, given its primary focus on the online component of travel time. This limits the applicability of the data in operationalising the Hensher Equation, in particular in providing a point estimate of the proportion of travel time (online and offline) spent working, i.e. the term p in the HE. Nevertheless, in Section "Relationship to Hensher equation and travel time valuation" we discuss how even this restricted information is sufficient to establish upper and lower bounds on this parameter, thus offering opportunities for using information regarding time spent online (harvested through a survey or other mechanism) for modelling and appraisal purposes.

Sample characteristics and data processing
Responses from 1055 travellers were collected in the two waves. We excluded from the analysis respondents whose time use data (our dependent variables) were missing and participants who reported not to have with them any device with an internet connection. We also excluded two respondents who reported a trip duration of 23 h, which is not realistic given the distance covered by the route where the survey was conducted. The resulting distribution of trip duration is displayed in Fig. 2, showing that the majority of the trips lasted less than 60 min.
Finally, we analysed the duration of each activity and, if an activity lasted less than one minute, we replaced this with zero. This was an artefact generated by the data collection process, i.e. asking people what share of their trip was dedicated to each activity. It became clear that some people had stated shares such as 1% of a relatively short trip, and we concluded that this was not a reliable measure of their time allocation.
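The one-minute rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary keys and the example respondent are hypothetical, and leaving the remaining durations unscaled after zeroing is an assumption of the sketch rather than something stated in the text.

```python
def clean_durations(trip_minutes, shares, min_minutes=1.0):
    """Convert reported shares of a trip into minutes per activity.

    Durations shorter than `min_minutes` are treated as unreliable and
    set to zero. The other durations are left unchanged (no rescaling),
    which is an assumption of this sketch.
    """
    minutes = {}
    for activity, share in shares.items():
        duration = trip_minutes * share
        minutes[activity] = duration if duration >= min_minutes else 0.0
    return minutes


# Hypothetical respondent: a 40-minute trip, 1% of which is reported as online work.
example = clean_durations(
    40.0, {"work_online": 0.01, "leisure_online": 0.59, "offline": 0.40}
)
```

Here the reported 1% share corresponds to 0.4 minutes of online work, so it is replaced with zero, while the other two activities keep their implied durations.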
The sample size after data processing comprised 950 individuals. The socio-demographic characteristics collected in the survey are reported in Table 1, which shows a good representation across genders, age groups and employment status, which also included self-employed and students. The table also shows the use of ICT devices while travelling.
As explained in Section "Survey protocol", respondents reported the activities they engaged in and for how long. Table 2 shows that 82% of the sample were not using the internet connection for at least part of their trip. The average amount of time spent offline   Figure 3 shows the distribution of the activity duration across the sample, providing a more complete behavioural picture.
The travel time spent "offline", which we use as a base category in our model, can represent a range of activities (including offline work and leisure), but this breakdown was not collected, as the project focused on time use for online work and leisure versus other activities. Nevertheless, the literature in the field can help in the interpretation of this "offline" category of activities. Lyons et al. (2016) use data from the National Rail Passenger Survey (NRPS) to understand rail travel time use in an era of increasing digital activities and slightly decreasing car use in the UK.  Table 1.a to include only activities which are performed offline. Using data from 2014 (sample size = 27,812), it shows that about 44% of respondents spend at least some time window gazing/people watching, followed by reading for leisure and texting/calling friends and family members.

Modelling framework
The case for a discrete-continuous model
As mentioned above, respondents reported the activities they engaged in during their rail trip as well as their duration. The fact that each traveller could engage in more than one activity makes the use of traditional choice models inadequate, as it would violate the assumption of choice alternatives being mutually exclusive. Moreover, in reality the choice of activities is not independent of their duration. For this reason, we adopt a multiple discrete-continuous model which jointly accommodates the choice between alternatives that are not mutually exclusive as well as their duration. By doing so, we gain a more complete picture of the real-world behavioural process.

The MDCEV model
The multiple discrete-continuous extreme value (MDCEV) model was first proposed by Bhat (2005) and later extended in different directions (Bhat 2008; Pinjari and Bhat 2010; Castro et al. 2012). It represents the state of the art in modelling multiple discrete-continuous choices. The model and its extensions (accommodating mixed parameters, nesting and multiple budgets) were applied in several empirical contexts in the study of travel behaviour. Common applications are to the choice of vehicle type and mileage (e.g. Bhat and Sen 2006) and the type or timing and duration of activities (e.g. Srinivasan and Bhat 2005). Derived coherently with Random Utility Maximisation (RUM) theory, this model differs from traditional choice models as it allows agents to choose more than one option, relaxing the assumption of alternatives being mutually exclusive. The additive but non-linear utility formulation ensures that consumption of one good does not affect the utility of the others, and admits the possibility that these goods are imperfect substitutes. Thanks to its non-linear specification, the MDCEV model also allows for diminishing marginal returns, so that analysts can estimate the satiation experienced from each good. Bhat (2008) proposes the utility function specification as follows:

$$U(\mathbf{x}) = \sum_{k=1}^{K} \frac{\gamma_k}{\alpha_k}\,\psi_k \left[ \left( \frac{x_k}{\gamma_k} + 1 \right)^{\alpha_k} - 1 \right] \tag{1}$$

where $U(\mathbf{x})$ is a quasi-concave, increasing and continuously differentiable function with respect to the vector of time allocations $\mathbf{x}$, with elements $x_k$. The term $\psi_k$, parameterised as $\psi_k = \exp(\beta' z_k + \varepsilon_k)$, represents the baseline utility of activity $k$. It is a function of characteristics of the decision maker and of the alternative ($z_k$), and includes a constant to capture the generic preference for option $k$. The coefficient $\alpha_k$ exponentiates the amounts consumed and therefore is interpreted as a "pure" satiation parameter. Furthermore, the translation parameter $\gamma_k$ serves to admit the possibility of corner solutions, i.e. observations in which zero time is allocated to activity $k$.
Since both $\alpha_k$ and $\gamma_k$ capture aspects of satiation, Bhat (2008) proposes three different utility specifications which seek to overcome any resulting identification issues; these are discussed in Section "Implementation for rail travel time use". An extreme-value error term $\varepsilon_k$ is introduced to $\psi_k$ in a multiplicative fashion. When estimating the model, the analyst solves the problem of optimal allocation of time with respect to the amount of time invested in the $K$ activities, as follows:

$$\max_{t_1,\dots,t_K} U(\mathbf{t}) \quad \text{subject to} \quad \sum_{k=1}^{K} t_k = E, \quad t_k \ge 0 \;\; \forall k \tag{2}$$

where $t_k^*$ are the optimal amounts of time invested in each activity $k$ such that the time budget $E$ is exhausted. As explained in Bhat (2008), the problem above can be solved by forming the Lagrangian and applying the Kuhn-Tucker conditions. This procedure results in the probability expression for the time allocation pattern, where $M$ activities are performed:

$$P(t_1^*,\dots,t_M^*,0,\dots,0) = \frac{1}{\sigma^{M-1}} \left[\prod_{i=1}^{M} c_i\right] \left[\sum_{i=1}^{M} \frac{1}{c_i}\right] \left[\frac{\prod_{i=1}^{M} e^{V_i/\sigma}}{\left(\sum_{k=1}^{K} e^{V_k/\sigma}\right)^{M}}\right] (M-1)! \tag{3}$$

where $\sigma$ is a scale parameter, not estimated in our case as there is no price variation across products; and, with unit prices, $c_i = (1-\alpha_i)/(t_i^* + \gamma_i)$ and $V_k = \beta' z_k + (\alpha_k - 1)\ln(t_k^*/\gamma_k + 1)$. The probability expression above can be expressed either in terms of "amount consumed" and/or "expenditure" of the resources making up the budget. In our case, all alternatives (i.e. different activities undertaken whilst travelling) have the same price, i.e. one minute is worth the same for all alternatives, such that the two forms are interchangeable. For further details about the model, see Bhat (2008).
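To make the probability expression more concrete, the sketch below evaluates its logarithm for a single observed allocation. It assumes the unit-price form without an outside good given in Bhat (2008), with $V_k = \beta' z_k + (\alpha_k - 1)\ln(t_k/\gamma_k + 1)$ and $c_i = (1-\alpha_i)/(t_i + \gamma_i)$; the function name, argument names and numerical values are hypothetical and are not taken from the estimated model.

```python
import math


def mdcev_log_probability(t, v_base, alpha, gamma, sigma=1.0):
    """Log-probability of an observed time allocation under MDCEV.

    Assumes unit prices and no outside good (an assumption of this sketch).
    t      : minutes allocated to each of the K activities (zeros allowed)
    v_base : deterministic part beta'z_k of the log baseline utility
    alpha  : satiation parameters alpha_k (< 1)
    gamma  : translation parameters gamma_k (> 0)
    """
    chosen = [k for k, t_k in enumerate(t) if t_k > 0]
    m = len(chosen)
    if m == 0:
        raise ValueError("at least one activity must receive time")
    # V_k = beta'z_k + (alpha_k - 1) * ln(t_k / gamma_k + 1)
    v = [
        v_base[k] + (alpha[k] - 1.0) * math.log(t[k] / gamma[k] + 1.0)
        for k in range(len(t))
    ]
    # c_i = (1 - alpha_i) / (t_i + gamma_i), for chosen activities only
    c = [(1.0 - alpha[i]) / (t[i] + gamma[i]) for i in chosen]
    log_denominator = math.log(sum(math.exp(v_k / sigma) for v_k in v))
    return (
        -(m - 1) * math.log(sigma)
        + sum(math.log(c_i) for c_i in c)
        + math.log(sum(1.0 / c_i for c_i in c))
        + sum(v[i] / sigma for i in chosen)
        - m * log_denominator
        + math.lgamma(m)  # log of (M - 1)!
    )


# Hypothetical 45-minute trip: 20 minutes of online work, 25 minutes offline.
example_log_prob = mdcev_log_probability(
    t=[20.0, 0.0, 25.0],
    v_base=[0.2, -0.1, 0.0],
    alpha=[0.0, 0.0, 0.0],
    gamma=[5.0, 8.0, 10.0],
)
```

When only one activity is chosen (M = 1), the terms in $c_i$ cancel and the expression reduces to a logit probability in $V_k$, which offers a convenient sanity check on an implementation.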

Implementation for rail travel time use
The present paper aims to jointly model the activity choice of rail passengers as well as the time invested in each of the chosen activities. As mentioned above, the survey allows us to identify three activities:
• Work online
• Leisure online
• Offline
We derived the duration of the trip from the reported trip start time, origin and destination, and data on departures from and arrivals at stations for specific services provided by the rail operator. We used the resulting duration in minutes as a budget. As durations may vary not only by origin/destination combination but also by small deviations from the scheduled times, the resulting budgets are different for each respondent in the sample.
Different specifications of the model are also needed depending on whether there is an outside good, i.e. an activity that is chosen at least once by everyone in the sample (cf. Bhat 2008). As shown in Table 2, each of the three activities presents corner solutions (i.e. there is at least a person who does not perform an activity, otherwise the percentage would have been 100%) meaning that no activity qualifies as an outside good in this context.
The different "utility profiles" proposed by Bhat (2008) to limit the empirical identification issues related to $\alpha_k$ and $\gamma_k$ were tested on our base model, and a comparison of the model fit led us to choose the 'alpha-gamma' profile. As a consequence, we estimate three $\gamma_k$ coefficients (one for each activity) and a single $\alpha$ coefficient which does not vary across activities. In all the specifications that we estimated, $\alpha$ had a very small value and was not significantly different from zero ($\alpha \to 0$). Fixing $\alpha$ to zero implies that the utility function collapses to a log-formulation:

$$U(\mathbf{x}) = \sum_{k=1}^{K} \gamma_k\,\psi_k \ln\left( \frac{x_k}{\gamma_k} + 1 \right) \tag{4}$$

This formulation entails that additional units of time spent result in diminishing returns, as utility increases in a logarithmic fashion.
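The collapse to the log-formulation and the resulting diminishing returns can be checked numerically. The sketch below assumes the Bhat (2008) utility with a common $\alpha$, i.e. $\sum_k (\gamma_k/\alpha)\,\psi_k[(x_k/\gamma_k + 1)^{\alpha} - 1]$, and its limit $\sum_k \gamma_k \psi_k \ln(x_k/\gamma_k + 1)$ as $\alpha \to 0$; the function name and all parameter values are hypothetical.

```python
import math


def mdcev_utility(x, psi, gamma, alpha):
    """Bhat (2008) utility with a common alpha; alpha = 0 gives the log form."""
    total = 0.0
    for x_k, psi_k, gamma_k in zip(x, psi, gamma):
        if alpha == 0.0:
            total += gamma_k * psi_k * math.log(x_k / gamma_k + 1.0)
        else:
            total += (gamma_k / alpha) * psi_k * ((x_k / gamma_k + 1.0) ** alpha - 1.0)
    return total


# Hypothetical values: minutes of online work, online leisure and offline time.
x = [20.0, 0.0, 25.0]
psi = [1.2, 0.8, 1.0]
gamma = [10.0, 6.0, 15.0]

u_log = mdcev_utility(x, psi, gamma, alpha=0.0)
u_small_alpha = mdcev_utility(x, psi, gamma, alpha=1e-6)

# Utility gained from one extra minute of online work, early and late in the activity.
gain_early = mdcev_utility([1.0, 0.0, 0.0], psi, gamma, 0.0)
gain_late = (mdcev_utility([21.0, 0.0, 0.0], psi, gamma, 0.0)
             - mdcev_utility([20.0, 0.0, 0.0], psi, gamma, 0.0))
```

With these values `u_small_alpha` is numerically very close to `u_log`, and `gain_late` is smaller than `gain_early`, which is the satiation effect described above.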
In the present application, we introduce participants' socio-demographic characteristics as well as trip-specific characteristics (as in our dataset each participant reports information about one trip) in the discrete part of the model through $\psi_k$, where a positive effect of a variable increases the probability of performing that activity. In addition to exploring the measurable determinants of the choice between activities, we also parameterise the continuous part of the model, through the $\gamma_k$ parameters. In particular, we specify the satiation parameter of an activity as $\gamma = \gamma_{base} + \delta_1 z_1 + \dots + \delta_Q z_Q$ (where the activity-specific subscript is removed for notational brevity), and we assess the impact of $Q$ variables on the value of $\gamma$. As in the discrete case, a positive value of the shift $\delta_1$ implies that the $z_1$ variable is associated with a higher value of $\gamma$. This corresponds to less rapid satiation and therefore higher willingness of the individual to invest their time in that activity, ceteris paribus. The influence of many different variables and interactions was tested on both the discrete and continuous parts of the model. In our final model, we only retain the ones having a significant effect and give an overview of those that were excluded.
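The link between a higher satiation parameter and more time invested can be illustrated with a deterministic allocation under the log-formulation. The sketch below is not the estimation or forecasting procedure used in the paper: it ignores the error terms, and the function names, the covariate (a business-trip dummy) and all numerical values are hypothetical. It uses the Kuhn-Tucker conditions of the log utility, under which the marginal utility $\psi_k/(t_k/\gamma_k + 1)$ is equalised across chosen activities.

```python
def gamma_from_covariates(gamma_base, shifts, z):
    """gamma = gamma_base + sum_q shift_q * z_q (must remain positive)."""
    value = gamma_base + sum(d * z_q for d, z_q in zip(shifts, z))
    if value <= 0:
        raise ValueError("gamma must be positive")
    return value


def allocate_time(psi, gamma, budget, tol=1e-10):
    """Deterministic allocation maximising sum_k gamma_k*psi_k*ln(t_k/gamma_k + 1).

    The Kuhn-Tucker conditions give t_k = gamma_k * (psi_k / lam - 1) when
    psi_k > lam and t_k = 0 otherwise; lam is found by bisection so that the
    allocations exhaust the budget. Error terms are ignored in this sketch.
    """
    def spent(lam):
        return sum(g * max(p / lam - 1.0, 0.0) for p, g in zip(psi, gamma))

    lo, hi = 1e-12, max(psi)  # spent(hi) = 0, spent(lo) is very large
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spent(mid) > budget:
            lo = mid  # multiplier too small: too much time allocated
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [g * max(p / lam - 1.0, 0.0) for p, g in zip(psi, gamma)]


# Hypothetical parameters for work online, leisure online and offline.
psi = [1.5, 1.0, 1.0]
gamma_base = [4.0, 8.0, 12.0]
shift_business = 6.0  # hypothetical positive shift for a business-trip dummy

gamma_work_business = gamma_from_covariates(gamma_base[0], [shift_business], [1.0])
t_other_trip = allocate_time(psi, gamma_base, budget=45.0)
t_business_trip = allocate_time(psi, [gamma_work_business] + gamma_base[1:], budget=45.0)
```

In this example the positive shift raises the satiation parameter for online work from 4 to 10, and the time allocated to online work within the same 45-minute budget increases accordingly, at the expense of the other two activities.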

Relationship to Hensher equation and travel time valuation
In section "Literature review", we discussed how the discipline of travel-based multitasking (travel time use) has been flourishing in the past two decades. While these studies have added ideas and insights to the field, probably the key theoretical contribution in the explicit modelling of travel time use remains the so-called Hensher equation (HE), proposed by Hensher (1977), formalised by Fowkes et al. (1986) and more recently derived from first microeconomic principles by Batley (2015). The equation explicitly links the value of business travel time savings ($VBTTS$) to considerations about travel time use and productivity, and is typically formalised as:

$$VBTTS = (1 - r - pq)\,MPL + MPF + (1 - r)\,VW + r\,VL \tag{5}$$

where $p$ is the proportion of business travel time saved that would have been spent working; $q$ is the productivity of working whilst travelling relative to at the workplace; $r$ is the proportion of business travel time saved that is allocated to leisure; $MPL$ is the value of the marginal product of labour (often wage is taken as a proxy); $MPF$ is the value of extra output due to reduced travel fatigue (in practice often omitted due to difficulties in estimation); $VW$ is the difference between the employee's valuations of 'contracted' work time and travel time; and $VL$ is the difference between the employee's valuations of leisure time and travel time.
The framework focuses upon activities that occur during the travel time saved (reduced) rather than across the entirety of the journey, although Batley (2015) highlighted this distinction and explicitly dealt with it in the course of his derivation. That notwithstanding, values of $p$, $q$ and $r$ are usually approximated by those observed across the entire journey (Wardman and Lyons 2016; Pawlak et al. 2014), acknowledging that this assumption may not hold in practice, e.g. DfT (2009).
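As a worked illustration, the sketch below evaluates the Hensher equation in its commonly cited form, $VBTTS = (1 - r - pq)\,MPL + MPF + (1 - r)\,VW + r\,VL$. The function name and the numerical inputs are hypothetical and serve only to show how the proportion of travel time spent working enters the calculation.

```python
def hensher_vbtts(p, q, r, mpl, mpf=0.0, vw=0.0, vl=0.0):
    """Value of business travel time savings from the Hensher equation.

    VBTTS = (1 - r - p*q) * MPL + MPF + (1 - r) * VW + r * VL
    p and r are proportions in [0, 1]; q is relative productivity (>= 0).
    """
    if not 0.0 <= p <= 1.0 or not 0.0 <= r <= 1.0:
        raise ValueError("p and r must lie between 0 and 1")
    if q < 0.0:
        raise ValueError("q must be non-negative")
    return (1.0 - r - p * q) * mpl + mpf + (1.0 - r) * vw + r * vl


# Hypothetical inputs: 30% of saved time would have been spent working at 90%
# of workplace productivity, 20% of saved time goes to leisure, MPL = 30 per hour.
vbtts_restricted = hensher_vbtts(p=0.3, q=0.9, r=0.2, mpl=30.0)
vbtts_full = hensher_vbtts(p=0.3, q=0.9, r=0.2, mpl=30.0, mpf=1.0, vw=2.0, vl=5.0)
```

The first call corresponds to a restricted form in which the fatigue and employee valuation terms are set to zero; a larger value of $p$ (more productive work during travel) lowers the resulting value of time savings.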
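In the context of the present paper, we argue that the proposed MDCEV methodology can serve as an additional mechanism for modelling and forecasting upper and lower bounds of $p$. In particular, note that:
$$T_{ON} + T_{OFF} = \tau, \qquad T_{ON} = W_{ON} + L_{ON}, \qquad T_{OFF} = W_{OFF} + L_{OFF} \tag{6}$$
$$W = W_{ON} + W_{OFF}, \qquad L = L_{ON} + L_{OFF} \tag{7}$$
where $W_{ON}$ is travel time online spent working; $L_{ON}$ is travel time online spent not working; $W_{OFF}$ is travel time offline spent working; $L_{OFF}$ is travel time offline spent not working; $W$ is travel time spent working (online and offline); $L$ is travel time spent not working (online and offline); $T_{ON}$ is total travel time spent online; $T_{OFF}$ is total travel time spent offline; $\tau$ is journey duration.
Equation 7 can be equivalently expressed as:
$$W = W_{ON} + \theta\,T_{OFF} \tag{8}$$
$$L = L_{ON} + (1 - \theta)\,T_{OFF} \tag{9}$$
where $\theta$ denotes the proportion of offline travel time devoted to work. Hence, using Eqs. 8 and 9:
$$W + L = W_{ON} + L_{ON} + T_{OFF} \tag{10}$$
Hence:
$$W + L = T_{ON} + T_{OFF} \tag{11}$$
Observing from Eq. 6 that $T_{ON} + T_{OFF} = \tau$, this can be rewritten as:
$$W + L = \tau \tag{12}$$
This in turn leads to the following, based on Eq. 7:
$$W = \tau - L_{ON} - L_{OFF} \tag{13}$$
In addition, based on Eq. 7 and observing that times need to be non-negative, it is possible to establish:
$$W_{OFF} \geq 0 \;\Rightarrow\; W \geq W_{ON}, \qquad L_{OFF} \geq 0 \tag{14}$$
Thus it is possible to establish, using Eqs. 13 and 14, and dividing by the total journey duration $\tau$:
$$w^*_{ON} \leq p^* \leq 1 - l^*_{ON} \tag{15}$$
where $w^*_{ON} = W_{ON}/\tau$ is the proportion of the total travel time spent working online; $p^* = W/\tau$ is the proportion of the total travel time spent working (online and offline); $l^*_{ON} = L_{ON}/\tau$ is the proportion of the total travel time spent online not working.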
It therefore follows from Eq. 15 that, to establish lower and upper bounds of $p^*$, it is sufficient to observe and model time allocations to online work and online non-work. Equation 15 is also intuitive: the total amount of time spent working must be at least as large as the time spent on online work (as it can also include an offline work component), but no greater than the total journey time net of time spent online on non-work activities. If work is conducted entirely online, the left weak inequality becomes an equality, and it is possible to estimate the value of $p^*$ exactly. Analogously, if all non-work time is conducted online, the right inequality becomes an equality, allowing exact estimation of $p^*$. Conversely, if work and non-work are conducted entirely offline, the left-most and right-most terms in Eq. 15 tend to 0 and 1 respectively, yielding no additional information on the size of $p^*$. Hence the proposed model provides a way to estimate a proxy for the $p$ term in the HE using data concerning participation in digital activities. Such data can increasingly be harnessed in an automated manner, by analysing traffic on the network infrastructure for specific 'fingerprints' that particular activities can have, as demonstrated for example by Li et al. (2017). A caveat is that such approaches are debated in relation to intrusiveness and privacy (for more discussion, see Ghaleb 2016).
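The bounds described above can be computed directly from the minutes of online work and online non-work observed (or forecast) for a journey. The sketch below is illustrative only: the function name is hypothetical and the input values are invented, not taken from the survey data.

```python
def p_star_bounds(online_work_min, online_nonwork_min, journey_min):
    """Lower and upper bounds on the share of journey time spent working.

    lower = online work / journey duration
    upper = 1 - online non-work / journey duration
    """
    if journey_min <= 0:
        raise ValueError("journey duration must be positive")
    if online_work_min < 0 or online_nonwork_min < 0:
        raise ValueError("times must be non-negative")
    if online_work_min + online_nonwork_min > journey_min:
        raise ValueError("online time cannot exceed journey duration")
    lower = online_work_min / journey_min
    upper = 1.0 - online_nonwork_min / journey_min
    return lower, upper


# Hypothetical 40-minute journey: 10 minutes of online work and
# 20 minutes of online non-work, with the remaining 10 minutes offline.
lower, upper = p_star_bounds(10.0, 20.0, 40.0)

# Limiting case: nothing is done online, so the bounds are uninformative.
no_info = p_star_bounds(0.0, 0.0, 40.0)

# Limiting case: the whole journey is spent online, so the bounds coincide.
exact = p_star_bounds(10.0, 30.0, 40.0)
```

The width of the interval equals the share of the journey spent offline, which is why greater online engagement narrows the range of admissible values.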

Model estimation results
Our models were estimated in R (R Core Team 2018), using the choice modelling package Apollo (Hess and Palma 2019). The results of our final model estimation are reported in Table 3, where for each coefficient we report the estimate followed by the robust t-ratio in brackets. The specification presented here includes the parameters that were found to have a significant effect on the outcome variables, as well as a limited number of coefficients which were not significant but displayed an intuitive sign.
Our first observation is that the choice of activities as well as their durations are indeed shaped by attributes of the respondent and the journey. This confirms the general observation from the literature (Keseru and Macharis 2018; Pawlak 2019) that travel-based multitasking is a highly heterogeneous phenomenon. The first set of variables in the table represents the activity-specific constants in the discrete part of the model, where offline is taken as the reference activity. These parameters do not carry an obvious interpretation because they refer to the baseline observations and because of the lack of a clear distinction between the discrete and continuous parts of the model (cf. Bhat 2008). In our baseline model (without socio-demographic variables) these δ parameters reflect the split of activity choice observed in Table 2.
The age of participants has an effect on their engagement in online activities, with younger people (age < 25) displaying a lower probability of working and a higher probability of performing online leisure activities vis-à-vis passengers aged 25-45, while passengers aged above 45 are less likely to engage in leisure. This finding is intuitive, as across various domains younger people show a greater propensity to participate in online activities.
Passengers who are self-employed are found to be less likely to engage in leisure activities, while retirees are less likely to work while travelling; both findings accord with intuition.
Passengers with a first class ticket are found to be more likely to engage in online work and leisure activities, and less likely to stay offline, than other ticket holders. This could be explained by the greater comfort of first class, better wifi connectivity and/or by a greater predisposition of first class travellers (generally characterised by more business travel and higher incomes) towards online activity. No other variable related to ticket type was found to be significant, but through correlation analysis between ticket type and trip purpose we observed that the majority of season ticket holders were commuters (which stands to reason). We believe that some of these effects are captured by the variable identifying trip purpose.
Indeed, several significant coefficients show the relevance of trip purpose as an explanator: daily commuting, business and education trips are associated with a higher likelihood of working online as compared with leisure trips. Business trips are also associated with a lower likelihood of performing online leisure activities, with passengers more likely to be performing online work followed by offline activities. Again, this stands to reason, given that business passengers are generally travelling within their employers' time. Those who travel during the morning peak hour (6 AM-10 AM, on weekdays) are more likely to engage in work activities online, making productive use of an early morning business trip and/or the morning commute. The latter is interesting in that commute travel is not generally regarded as the employer's time. We also find, somewhat surprisingly, that during weekend trips, the likelihood of working online is higher than that of staying offline or performing leisure activities online. This perhaps reflects the increased 'blurring' of work and leisure, a phenomenon which has emerged with the widespread adoption of mobile devices.
We also considered the frequency of travelling along the given route, and found that those with a "medium" travel frequency (between once a week and once a month) are more likely to work online than those who travel more or less frequently.
People who travel with another adult are less likely to work online while travelling, possibly as people travelling together might prefer to chat rather than looking at their electronic devices. Participants were asked how much they relied on an internet connection for carrying out their activities on the train. As expected, we find that people who were more reliant on the connection are also more likely to work during their trip.
The second part of the table reports the translation parameters. The $\gamma_k$ parameters reflect the satiation of the baseline category in the model, which in this case corresponds to people who travel alone with a ticket other than first class. Within this group, the baseline parameters show that online work and offline activities result in lower satiation (and therefore engagement for a longer time) than online leisure. Interestingly, this result is not in line with Rasouli and Timmermans (2014), who report from the Netherlands that respondents' engagement in online work activities is not long-lasting. Cultural and sociodemographic differences between the sample used in Rasouli and Timmermans (2014) and the one for the present study might play a role in the difference in findings. Of course, five years is a long time in the internet era, and some behavioural patterns might have evolved since 2014.
We also find that when the passenger is travelling in first class, the time spent working online is lower than for those travelling with other tickets, although the differential is very small. As some evidence has suggested (Brown and O'Hara 2003), this could be motivated by the fact that during business or other kinds of important trips, travellers make arrangements expecting poor connection (for example) and are thus better prepared to work offline. Furthermore, when people engage in online leisure activities, these last longer if the trip is a daily commute or a business trip. Finally, if people travelling with one other adult engage in online leisure, they do so for a shorter time than they would when travelling alone.
On top of the person and trip-level characteristics described above, we tested the effect of several interactions between the individual variables but excluded them from the final model as they did not have a significant impact on the activity choice. Namely, we interacted time of day and day of the week with ticket type, employment status with trip purpose and with age; gender with age and with accompanying people; internet access mode (wi-fi or mobile data) both alone and interacted with age. We believe that some of these effects failed to have a significant impact on choices partly because of correlation between some independent variables (such as ticket type and purpose, as mentioned above).
The last coefficient in Table 3 is a scale parameter. This was estimated because the dataset used for analysis was collected at two points in time and there is a potential for there being different levels of random error in the data-which could be due to many different unobserved factors. The scale parameter shows that there are indeed significant differences, and that the magnitude of the coefficients for 2017 is about 26% larger than the 2018 ones.

Model validation
In the section above, we presented our chosen model. In order to assess the goodness of fit of this model to explain the data, we cannot use relative measures such as the log-likelihood of the model, as we are not comparing different (nested) model structures. We therefore test its goodness by means of its predictive capability, i.e. how well the model reproduces the choices observed in the data. Table 4 below shows, for each activity, the share of people performing it in the data and as predicted by the model estimated above, i.e. the "discrete" dimension of the choice. This is a so-called "base prediction", obtained by sample enumeration. As the paired t-test in the rightmost column shows, the difference between the shares observed in the data and the predicted shares is not significant, confirming the goodness of fit of the model in Table 3.
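The comparison of observed and predicted shares can be sketched as follows. This is one possible reading of the test, assuming that it is applied to per-passenger differences between the observed participation indicator (0 or 1) and the predicted participation probability; how the test was actually computed is not detailed here. The function name and the data are hypothetical.

```python
import math
import statistics


def paired_t(observed, predicted):
    """Paired t-statistic for the mean of (observed - predicted)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with n >= 2")
    diffs = [o - p for o, p in zip(observed, predicted)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical data for one activity: whether each of five passengers
# took part (1) or not (0), and the model's predicted probability.
observed = [1, 0, 1, 1, 0]
predicted = [0.7, 0.3, 0.6, 0.8, 0.4]

observed_share = statistics.mean(observed)
predicted_share = statistics.mean(predicted)
t_stat = paired_t(observed, predicted)
```

A t-statistic that is small in absolute value relative to the critical value for n - 1 degrees of freedom would indicate that the predicted share is not significantly different from the observed one.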

Policy analysis
One of the additional advantages of employing MDCEV in the present context is the potential to forecast changes in behaviour (reflected by discrete and continuous components) in response to policy measures. The standard approach to forecast with the MDCEV model is the efficient algorithm proposed by Pinjari and Bhat (2011), which is also available within the Apollo package.
In each of the hypothetical policy scenarios described below, we apply the algorithm to obtain forecasts for engagement in activities and time use at the individual level, and use these to explore the changes in both engagement and duration of activities when certain policy measures are enacted.

Case study 1: Facilitation of mobile ICT use
Past research has found that rail passengers try to make use of their travel time and positively value improvements to the in-vehicle environment. Research in Australia (Douglas 2004) found that rail passengers are willing to pay for improvements in rolling stock design, quietness, improved lighting, smoothness of ride and seat comfort. Trains would seem to be associated with relatively high levels of engagement with in-travel activities (Ettema et al. 2010), and against this background several studies have highlighted the specific importance of providing an environment where people can comfortably use their laptop or tablet (e.g. Groenesteijn et al. 2014). This could also imply the availability of tables, electric sockets and connectivity. Such policies could increase the number of people who decide to use their tablets and laptops while travelling. As is common in forecasting analyses, we consider the difference in the key model outputs between two hypothetical extreme situations. In our case, this means the difference in the shares of passengers undertaking the different activities and the time spent in each. In particular, we produce two scenarios:
• Scenario A: We study the difference in activity participation and duration between a situation where nobody uses a laptop (A1) and everyone uses a laptop (A2).
• Scenario B: We study the difference in activity participation and duration between a situation where nobody uses a tablet (B1) and everyone uses a tablet (B2).
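The construction of such extreme scenarios can be sketched as follows: the estimation sample is copied twice, with the relevant device indicator forced to 0 and to 1 for every passenger, and a forecasting routine is then applied to each copy. The variable names, the sample rows and the `predict` argument are hypothetical; in particular, `predict` stands in for whatever forecasting routine is used and is not an implementation of the MDCEV forecasting algorithm.

```python
def make_scenario(sample, variable, value):
    """Return a copy of the sample with one covariate forced to a value."""
    return [{**row, variable: value} for row in sample]


def scenario_difference(predict, low, high):
    """Average change in a predicted outcome between two scenarios.

    `predict` maps one passenger record to a predicted outcome, e.g.
    minutes of online work. It is supplied by the analyst.
    """
    if len(low) != len(high) or not low:
        raise ValueError("scenarios must be non-empty and equally long")
    changes = [predict(h) - predict(l) for l, h in zip(low, high)]
    return sum(changes) / len(changes)


# Hypothetical sample of three passengers.
sample = [
    {"id": 1, "uses_laptop": 1, "uses_tablet": 0},
    {"id": 2, "uses_laptop": 0, "uses_tablet": 1},
    {"id": 3, "uses_laptop": 0, "uses_tablet": 0},
]

scenario_a1 = make_scenario(sample, "uses_laptop", 0)  # nobody uses a laptop
scenario_a2 = make_scenario(sample, "uses_laptop", 1)  # everyone uses a laptop
```

Because only one covariate is altered and all other characteristics are held at their observed values, the difference between the two forecasts isolates the effect of that covariate.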

Case study 2: Moving from peak to off-peak
Train operators might wish to encourage shifts in ridership from peak towards shoulder peak or off-peak times, to alleviate congestion and/or crowding and to make better use of the available capacity. There are a number of measures that can be introduced in support of this objective, such as off-peak ticket price reductions or collaboration with employers to offer schemes which incentivise off-peak travel for their employees through special fares. If effective, such measures could potentially relieve on-board crowding and, if crowding/congestion have adverse effects on dwell times, could potentially also reduce journey time. Both of the aforementioned benefits could have positive impacts on activity engagement while travelling, which we assess in Scenario C. In this scenario, we compare the two hypothetical situations where the whole sample travels off peak (C1) and at peak time (C2), with random allocation to morning or afternoon peak in the latter case.

Results of the forecasting case studies
In this section, we present the results of the case studies presented above. We consider separately three groups of travellers depending on the duration of their trip. This is done to control for the possibility that the effect on activity engagement may change depending on the overall duration of the trip. In particular, we consider short trips (in-train time lower than 30 min), medium-duration trips (between 30 and 60 min) and long trips (over 60 min). It is worth noting that the nominal journey time for this route is 51 min, so trips longer than 60 min almost certainly involve delays. Figure 5 shows the forecasted changes, for each scenario, in the share of the sample performing each activity and in the time spent on it. Confidence intervals for these predictions have been calculated using the delta method, and only significantly different results within each scenario are shown in the graphs.
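The delta method approximates the variance of a function of the estimated parameters by combining the gradient of that function with the covariance matrix of the estimates. The sketch below uses a numerical gradient; the function names, the example function and all numerical values are hypothetical, and the sketch is not a description of the implementation used to produce Fig. 5.

```python
import math


def delta_method_se(g, theta, cov, eps=1e-6):
    """Approximate standard error of g(theta) as sqrt(grad' * cov * grad),
    with the gradient obtained by central finite differences."""
    n = len(theta)
    grad = []
    for i in range(n):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((g(up) - g(down)) / (2.0 * eps))
    variance = sum(grad[i] * cov[i][j] * grad[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)


def share(theta):
    """Hypothetical prediction: a logistic share driven by one parameter."""
    return 1.0 / (1.0 + math.exp(-theta[0]))


# Hypothetical estimate and variance of a single parameter.
theta_hat = [0.5]
cov_hat = [[0.04]]

estimate = share(theta_hat)
se = delta_method_se(share, theta_hat, cov_hat)
interval = (estimate - 1.96 * se, estimate + 1.96 * se)  # approx. 95% CI
```

Two scenario forecasts would then be regarded as significantly different when the interval for their difference, obtained in the same way, excludes zero.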
The top-left pane of Fig. 5 shows the share of people performing each activity in Scenarios A1 and A2. We observe that, in a move from not using a laptop to using one, we can expect a 20% increase in the number of people taking long trips performing online work and about 25% increase for those taking short trips. This result reflects the strong positive impact of using a laptop presented in Table 3 (Est: 1.187, t-ratio: 9.16) on the likelihood of engaging in online work. Conversely, engagement in the other activities is expected to decrease following such a change in behaviour, with the strongest negative impacts on the number of people performing offline activities and leisure during medium-duration journeys (approx. -7%) and short leisure journeys (-8.3%).
In terms of activity duration (top-right pane), the results are in line with expectations. We see a strong increase in the time spent doing work online for all trip durations, although the increase is stronger for long trips (+ 13.2 min). The duration of offline and leisure activities is reduced for all trip durations, again with a more marked decrease for long trips.
The two central panes in Fig. 5 show the same results for Scenarios B1 and B2. The effect of everyone (vs. no-one) using a tablet has a similar impact as the one observed for laptops in Scenario A, but the changes are smaller in magnitude. In particular, the only significant effect on the share of people performing work activities online is on trips longer than one hour (+ 9%). As in Scenario A, we see a reduction in the share of people participating in online leisure and offline activity for trips between 30 min and one hour, as well as a modest reduction for online leisure in short trips (-3%) and offline for long trips (− 3%).
The pattern of changes in time use is also similar to that observed in the laptop case. As the centre-right graph in Fig. 5 shows, there is an increase in time devoted to online work on long (+ 5.2 min), medium (+ 3.6 min) and short (+ 1.4 min) trips. The largest decrease in time allocation is for offline (− 2.8 min) and online leisure (− 2.4 min) activities during long trips, with smaller decreases for medium and short trips.
Finally, the two bottom panes in Fig. 5 show the share of engagement in the different activities and the time use patterns in Scenario C. No significant effect is observed on engagement in online work for the discrete part of the model (activity participation). We observe very modest increases in the share of the sample participating in online leisure and offline activities, as shown in the bottom-left pane. In terms of time allocation, we observe a small reduction in the time allocated to online work: -1.05 min for long trips, -0.68 min for medium trips, -0.28 min for short trips. Similarly, small increases in time allocation to online leisure and offline activities are observed.
This last exercise highlights a limited impact on activities generated by the shift between peak and off-peak, largely indicating that people will perform the same activities (and for the same duration) in both occurrences, with, on average, small changes.

Application to Hensher equation and travel time valuation
While the scenarios described in this section are mainly produced for illustration purposes, they give an idea of the effect that different policies might have on how people use their travel time, and serve as an indication for more advanced policy analyses. As shown in Section "Relationship to Hensher equation and travel time valuation", insights into time spent online on work and non-work activities provide an indication of the upper and lower bounds of the p * parameter in the Hensher equation, itself of importance to travel time valuation. Figure 6 presents distributions of such bounds for the case studies described in the previous section, as calculated using Eq. 15. Figure 6 demonstrates how the application of the MDCEV-based forecasting methodology can deepen our understanding of policy scenarios associated with on-board ICT use (use of laptop and tablet in Scenarios A and B respectively) or shifts in the timing of travel between peak and off-peak times (Scenario C). In Scenario A, it is possible to observe how the effect of an increased laptop use penetration shifts the distribution of the lower bound rightwards, due to the positive effect that laptop use has on the propensity to engage in online work. On the other hand, the effect on the upper bound is far less pronounced, which is due to the fact that laptop use does not significantly affect the utility of online non-work. Thus even full penetration of laptop use is unlikely to diminish the relative attractiveness of online non-work across the entire sample.
A similar effect is observed for Scenario B, although on first inspection, it might appear counterintuitive that the effect is stronger than that for laptops in Scenario A. However, this is most likely due to the fact that in Scenario B1, it is already possible to observe a secondary (though lower) peak in the density, reflecting the effect of higher laptop penetration in the sample despite the absence of tablet use (recall Table 1). Hence, overlaying the full penetration of tablet use will shift the density more to the right given the pre-existing higher penetration of laptop use in the sample, as compared with overlaying full laptop penetration onto less prevalent tablet use, as applied to Scenario A1. In both scenarios, however, the net effect is a narrowing of the range of p * , which is in line with the interpretation of Eq. 15 in Section "Relationship to Hensher equation and travel time valuation". This can be contrasted with Scenario C, where differences between C1 and C2 are hardly noticeable, since the peak time variable turns out to have a much weaker effect on online activity participation. In other words, higher penetration of ICT, by stimulating participation in online work (according to the current data), provides more precise insights (narrower bounds) concerning the true underlying values of p * . Arguably, the results are encouraging from the point of view of demonstrating how ICT infrastructure and understanding online behaviour can assist in broader transport policy considerations, such as valuing travel time savings.
In terms of relating our findings to the existing empirical research, we observe consistency between the values of p * observed in our study, both in the data and in the different policy scenarios, and those reported elsewhere. In particular, the summary of past empirical studies concerning the HE provided by Wardman and Lyons (2016) indicated a value of 0.46 for p * in a similar context to the current study, i.e. UK rail, based on the 2008 SPURT study (Mott MacDonald et al. 2009). Furthermore, their review indicated substantial heterogeneity in the p * values across modes, countries and journey durations. In fact, Wardman et al. (2015, p. 202) point out that 'many of these [HE] parameters are specific to the individual trip being undertaken'. Combined, the above findings provide a further piece of evidence that the proposed approach of computing estimates of p * using the MDCEV approach, being capable of flexibly reflecting inter- and intra-individual and inter-trip variation through suitable covariates, offers a promising avenue towards incorporating the notion of productive travel time (travel-based multitasking) in investment appraisal frameworks and transport policy more broadly. Similarly, we also note the potential to harness the data from ICT infrastructure to derive more comprehensive and more regularly updated investment appraisal parameters, thus contributing towards even more data- and evidence-driven transport policy-making.

Conclusions
The analysis reported in this paper has delivered new insights into the choice and duration of online activities while travelling by train. As we highlighted at the outset, such insights are timely, given that many public transport authorities across the world are considering the business case for investment in technologies that would enhance online activity whilst travelling. For example, the debate is vigorous in the UK, where the Department for Transport has announced the formation of a dedicated unit ('Acceleration Unit') tasked with increasing the speed of delivery of transport upgrade projects, including those related to connectivity (Gov.uk 2020). Indeed, this policy aspiration motivated the demonstrator project that generated the data analysed in this paper. As indicated by Pawlak (2020), similar discussions concerning connectivity in transport models are also taking place in other countries, e.g. China (Chin et al. 2019), France (Bounie et al. 2019) and South Korea (Lee et al. 2019). Against this background, the present paper will help to inform more robust methodologies for estimating the expected demand for connectivity under a variety of scenarios, progressing beyond previous methodologies based on ad-hoc assumptions concerning activities and participation rates, e.g. Ofcom (2018a, b). The empirical findings presented in this paper were mostly intuitive and confirmed many of the expectations around the behavioural processes of interest. The findings have also contributed to the academic literature, by applying the MDCEV model to the context of time spent online while travelling. Moreover, the paper has advanced the theoretical link between travel-based multitasking, in particular involving online activities, and the valuation of travel time savings by illustrating how forecasting via MDCEV can yield insights regarding parameters used in such valuations.
We show that in this age of progressive digitisation of activities, accompanied by evolution in methods of collecting data associated with such participation (while respecting respondent privacy), the results from this analysis could be used to operationalise sophisticated investment appraisal methodologies, such as those based on the Hensher Equation.
Nevertheless, we acknowledge that the current analysis also embodies some limitations. First of all, the sample was not intended to be representative of a given population and was relatively small. The latter feature limited our ability to test many interactions between different independent variables, as the resulting groups were not sufficiently large. The ability to test a variety of effects was also limited by the relatively small number of socio-demographic characteristics collected in the survey. Moreover, the data was based on information reported by respondents, and the literature on travel diaries has highlighted how recall-based tools can result in the omission or underestimation of activities with a short duration (Zmud and Wolf 2003). Conceivably, this issue could be more severe for activities that do not involve a specific trip, especially if undertaken while travelling. In our case, this effect was hopefully mitigated by the fact that participants were given the survey form while travelling and invited to complete it either before getting off the train or shortly after.
While we have explored most of the information collected through the survey, we plan to extend the present model to incorporate latent factors related to productivity and perception of the speed and reliability of the connection while travelling. This further analysis could add new insights not only on the determinants of engaging in the three activities, but also on the links between these determinants and the latent factors. This is especially important at the time of a global pandemic, when positive characteristics of rail travel might help restore normal levels of ridership. Finally, to gain a better understanding of the effects of travel-based multitasking on mode choice, a comparative study looking at the activities conducted while travelling by a number of different modes would be an important avenue for extending the research.

Author contributions
The authors confirm contribution to the paper as follows: study conception and design: C. Calastri and J. Pawlak; analysis and interpretation of results: all authors; draft manuscript preparation: C. Calastri and J. Pawlak. All authors reviewed the results and approved the final version of the manuscript.
Funding
The first author acknowledges the financial support by the European Research Council through the consolidator grant 615596-DECISIONS.
Availability of data and material (data transparency)
Data access requests can be made to the authors.
Code availability (software application or custom code)
The code used for this paper is partly freely available in the form of the R software Apollo (www.ApolloChoiceModelling.com), and partly custom code that the authors are willing to share with readers upon request.