1 Introduction

Urban population growth and related mobility challenges motivate the expansion of public transport systems to cater for increased ridership without loss of end-user attractiveness. The resulting growth in system complexity and congestion poses challenges both for passengers seeking to minimise their travel effort and for operators managing larger passenger flows. The growing public availability of increasingly specific information regarding public transport (PT) connections, for example both scheduled and actual departures, has the potential to address both these issues in a cost-efficient way thanks to technological advancements in the electronic dissemination of both pre-trip and en route PT information (Footnote 1). In the literature, these electronic passenger information systems containing real-time (travel) information, RT(T)I, are sometimes termed advanced public transport (or transit) traveller information systems, A(P)TTIS (Nuzzolo et al. 2015). They can be based on site-specific equipment (signs and displays on vehicles and at stops and stations) or on personal devices such as smartphones and personal computers (Fonzone 2015; Ghahramani and Brakewood 2016; Harmony and Gayah 2017; Mulley et al. 2017). The information content on stationary or vehicle-based displays usually comprises scheduled and actual departure times, while journey planners and similar tools available through personal devices increasingly also include itineraries with updated departure and arrival times of connecting services at transfer points (for example, as described by Cats et al. 2016).

Waiting times, particularly under uncertainty, have been shown to be perceived as significantly more onerous than other time components of a PT trip (Wardman et al. 2016). However, both perceived and actual waiting times can be mitigated by the adoption of pre-trip or en route information (Brakewood and Watkins 2018). Moreover, in microeconomic consumer choice theory, information is essential in forming the foundation for the individual’s trade-offs between different utilities and disutilities. The demand for information may, however, be triggered by situations where: (1) the trade-off between options is obscured by some degree of uncertainty (Chorus et al. 2006; Farag and Lyons, 2008) and (2) the consumer is not sufficiently acquainted with the options to have developed habitual behaviour (as convincingly shown by Aarts et al. (1997) in an experimental and hypothetical test of students’ judgment of travel options). In addition, Lyons (2006), who subdivides the use of information into the planning and the execution phase of a trip, underlines the importance of taking the mental effort of pre-trip planning and associated information search into consideration. These arguments were subsequently supported by the findings of a qualitative study on the search for pre-trip travel information (Farag and Lyons, 2008), which found no evidence for modal shift as a behavioural response to information; instead, information is mainly sought ahead of performing complex or unfamiliar journeys and/or where there is uncertainty due to service disruptions. The latter is also indicated in a survey previously presented by Peirce (2003), which, however, was made before the advent of the current widespread use of hand-held internet-access devices among PT passengers (see also Daduna and Voss, 1996).

This rapid adoption of smartphones and associated applications providing RTI among the population of PT passengers in most developed parts of the world has motivated a proliferation of research on how the existence of this information affects passenger behaviour. As Gentile et al. (2016) note, the type of information passengers possess—be it past experience of perceived disutility or through an electronic aid—and where this is acquired, may determine their course of action.

The literature on behavioural impacts and use of RTI may roughly be subdivided into an analytic and an empirical strand. Brakewood and Watkins (2018) provide a comprehensive overview of the literature regarding the effects of the use of RTI on passengers’ actual and perceived waiting times, total travel times, ridership and perceived quality and security. In their synthesis, they report average waiting time gains of 2 min and perceived waiting time reductions of up to 30%, although these figures are subject to self-selection in the quoted surveys. Other recent empirical evidence of pre-trip information use is provided by Mulley et al. (2017) in their survey of awareness and usage of various information sources in metropolitan Sydney. According to their results, mobility apps and the like are primarily used by experienced PT users, while infrequent users tend to be more reliant on word-of-mouth and websites. In their web survey of a random sample of US citizens, Harmony and Gayah (2017) found that smartphone apps were the preferred medium for obtaining RTI for PT departure times. A similar result was obtained by Fonzone (2015) in his bus stop and vehicle-based survey of the RTI use of PT passengers and related trip attributes. He found widespread use of stationary RTI media (three out of four trips) and on-line journey planners accessed via computer or mobile phones (one half of trips), mainly with the aim of reducing waiting times or determining an appropriate departure time from the trip origin. According to the study, the choice of route was the stage of the trip that was most affected by information messages. Thus, information was used more in advance of or during trips in which multiple PT lines or stops were available in the perceived passenger choice set.

According to Brakewood and Watkins (2018), only analytical studies have analysed the overall effects on total travel time from the provision of RTI. One such example is provided by Cats et al. (2011) in the results from their mesoscopic dynamic model of the Stockholm metro. They arrive at a 3–4% total gain in travel time as an effect of RTI provided at platform, station or network level, with the higher figure for the latter level. During travel disruptions, these effects were amplified by up to 11% compared to a non-RTI scenario. To validate the different route choice modelling approaches, also regarding passengers having access to RTI regarding departure times, Fonzone and Schmöcker (2014) simulated three hypothetical approaches to PT travellers’ use of pre-trip information on the classical linear formation of optimal route choice strategies between sets of attractive routes, as originally suggested by Spiess and Florian (1989). Moreover, the authors discuss the effects on passenger behaviour from the availability of RTI regarding the adaptation of duration and location, i.e. which stop to choose for the first waiting time of a trip and when to depart from the previous location or activity. The passenger’s optimisation strategy would then target the maximisation of productive time, rather than just minimising travel time. The results from Monte Carlo simulations indicate the significance of how RTI is visualised and used, and how these different usage strategies can influence the total system travel time.

Finally, the issue of information accuracy has been addressed by analysts such as Ben-Elia et al. (2013), in their case in a unimodal Stated Preference (SP) experiment with motorists’ hypothetical choice of routes, and by Li et al. (2018) in a bimodal intra- and inter-day dynamic model setting. The former authors found that the reduced accuracy of travel time information resulted in increased randomness in choice and a shift from unreliable to reliable (but sometimes longer) routes and that prescriptive information had a greater impact on route choice than descriptive information. Their results also suggest that discrepancies between expected travel time (derived from experience) and predicted travel time according to RTI can lead to risk-averse behaviour and that travellers use information despite inaccuracies in order to “anchor” their choice decisions (Footnote 2). In the latter study, Li et al. (2018) found that the accuracy of RTI has a significant impact on the learning curve, and thus the adaptation rate, of route choice decisions. Reliable, or at least not systematically inaccurate, RTI leads to more rapid equilibrium stabilisation, while incorrect RTI reduces travellers’ receptiveness to the information and thus their willingness to adapt.

The significance of new, emerging sources of trip data in order to potentially delve further into the revealed behaviour of PT passengers has been emphasised by many researchers, e.g. Wang et al. (2018); Liu et al. (2010); Gadziński (2018) and Lee et al. (2016). This discourse indicates the relevance of capturing the potentially rich empirical data on actual choice strategies used by passengers from dedicated smartphone-based survey apps. Thus, drawing on these opportunities, in this paper we analyse data obtained from a user-mediated prompted recall (Stopher et al. 2015) mobile application-based travel survey (for details regarding this survey, for example, an extensive description of survey sample properties, see Berggren et al. 2019). In addition to user-revised trip trajectories and activities, data from the survey—which was carried out in the regional PT system of Scania, Sweden—also include stated passenger planning and optimisation strategies and the usage rate of departure time information ahead of PT trips based on context-aware notification prompting (Turner et al. 2017). Thus, the overarching aim of our study is to contribute to the indicated need for empiricism regarding the relationships and possible correlations between the use of pre-trip and en route PT travel information, passenger planning strategies and PT supply characteristics such as headway, departure reliability and in-vehicle travel time. We therefore aim to contribute to the knowledge regarding potential impacts on waiting times from pre-trip planning and information usage.

Our focus in the study has been to explore the following research questions:

  1. Are the effects of (the stated use of) planning strategies and usage of pre-trip information directly reflected in the revealed waiting times among PT passengers, or are there confounding factors?

  2. How are different forms of travel information utilised by PT passengers, e.g. for which activities and trip purposes is travel information used and at which locations? What characteristics of different passenger groups and contextual factors during trips matter when it comes to the utilisation of trip planning and optimisation strategies?

In our case, we have defined optimisation strategies according to whether the traveller indicates a specific arrival or departure time as desired in a digital journey planner before heading off on the PT trip.

The rest of the paper is organised as follows: Sect. 2 outlines the data and methods that form the basis of our analyses and contains specifications of the models we applied to test our assumptions. Section 3 contains the main results. Finally, in Sect. 4, we summarise our findings and point to directions for further endeavours in the field of PT passengers’ strategies and usage of travel information.

2 Method

2.1 Data collection

The survey, in which 136 persons reported a total of 13,495 trip legs during a 14-day period in November 2017, of which 2970 were undertaken by PT modes (bus and train), was performed in the Malmö-Lund area of Southern Sweden. Survey participants, whose characteristics are indicated in Fig. 1, were recruited manually based on convenience sampling over five consecutive weekdays, through handouts in the PT system (Footnote 3), thus making the sample well suited for direct analyses of travel behaviour in this system. The smartphone survey application, TRavelVU (Clark et al. 2017), is semi-automated, meaning that context data such as GPS readings and accelerometer data are collected automatically by the participants’ phones. Thus, the positions of trip breaks such as boarding, alighting and change of transport modes were recorded and transport modes were inferred in a back-end support system continuously connected to the phones involved. In addition, context-sensitive notifications were transmitted to the participants once a PT trip leg preceded by an access leg (walk, bike or car) was completed, asking for planning strategies and information used for this trip (the exact wording of the questions is provided in Table 2 in Sect. 2.2). The GPS trajectories from the survey were fused with auxiliary data regarding both scheduled and actual PT vehicle trajectories from GTFS (General Transit Feed Specification) and AVL (automatic vehicle location) data sources (the method is described in detail in Berggren et al. 2019). This enabled us to relate travel behaviour for each trip segment to corresponding PT service trip characteristics and level of service.

Fig. 1 Properties of the survey sample, based on questionnaire replies from the TRavelVU app

A few important definitions were used by the application to distinguish between activities and movements. Thus, an activity was recorded if the phone was within a square of 100 by 100 m for at least 2 min. Consequently, the en route activities “transfer” or “wait” were only recorded by the application if the duration was at least 2 min, and other transfers and waiting times had to be extracted from the produced itineraries by utilising the sequence of used transport modes. Transfers and waiting times below 2 min were assigned random durations in the interval [0,2] (Leif Linse, personal communication, 16 November 2016).
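The dwell rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TRavelVU implementation: positions are assumed to be already projected to metres, the 100 by 100 m square is interpreted as a bounding box around consecutive fixes, and the function name `detect_activities` and its parameters are hypothetical.

```python
from typing import List, Tuple

# One GPS fix: (time in seconds, x in metres, y in metres).
# Projected coordinates are an assumption made for this sketch.
Fix = Tuple[float, float, float]


def detect_activities(
    fixes: List[Fix],
    side_m: float = 100.0,       # side of the square, as stated in the text
    min_dwell_s: float = 120.0,  # minimum duration (2 min), as stated in the text
) -> List[Tuple[float, float]]:
    """Return (start, end) times of stays inside a side_m x side_m box."""
    activities = []
    i, n = 0, len(fixes)
    while i < n:
        t0, x, y = fixes[i]
        min_x = max_x = x
        min_y = max_y = y
        j = i
        # Extend the window while all fixes still fit in the bounding box.
        while j + 1 < n:
            _, nx, ny = fixes[j + 1]
            lo_x, hi_x = min(min_x, nx), max(max_x, nx)
            lo_y, hi_y = min(min_y, ny), max(max_y, ny)
            if hi_x - lo_x > side_m or hi_y - lo_y > side_m:
                break
            min_x, max_x, min_y, max_y = lo_x, hi_x, lo_y, hi_y
            j += 1
        t1 = fixes[j][0]
        if t1 - t0 >= min_dwell_s:
            activities.append((t0, t1))
            i = j + 1  # continue after the detected activity
        else:
            i += 1     # too short: slide the window start forward
    return activities


# Three minutes near the origin, then fast movement away from it.
example_fixes = [(0, 0, 0), (60, 10, 5), (120, 20, 10), (180, 15, 0),
                 (240, 500, 0), (300, 1000, 0)]
example_activities = detect_activities(example_fixes)
short_stop = detect_activities([(0, 0, 0), (60, 5, 5), (120, 300, 0)])
```

In this sketch a stop shorter than 2 min yields no activity, which mirrors why short transfers and waits had to be recovered from the sequence of transport modes instead.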

2.2 Data analyses

The research questions were explored using straightforward statistical tests including Chi square, linear regression and univariate ANOVA models, specified based on our empirical data regarding stated passenger planning (pre-trip planning or not?), optimisation strategies (arrival or departure time) and information usage (usage vs. non-usage and information source, if usage) in relation to explanatory variables such as individual characteristics and trip attributes based on scheduled PT vehicle trajectories. Two models were deployed, including dependent and independent variables as listed in Table 1. We used First Waiting Time (FWT) and Transfer Waiting Time (TWT) as indicators of passenger behaviour. The rationale behind this choice of dependent variables is that they are (1) relatively easy to measure given the survey methodology we used and (2) correspond to important decision points (or diversion nodes) during a PT journey, in both time and space (Gentile et al. 2016; Nuzzolo and Comi 2017).

Table 1 Models and variables used to explore our first research question (the results are presented in Sect. 3)

The trip purpose was inferred from the stated activity at the end of each trip. Correspondingly, “previous activity” was the activity recorded ahead of each trip. Home-ends and activity-ends were distinguished using a separate variable to enable analysis of potential behavioural differences between these (inspired by the approach applied by Hoogendoorn-Lanser et al. 2006). Stop type was inferred from line route trajectories; the algorithm through which these were inferred is further described in Berggren et al. (2019). Gender, Occupation and Flex time were taken from responses to a questionnaire in the survey app (see Appendix). Finally, day type and time of day were inferred from the time stamps of the GPS readings representing the start and end point of each trip leg in the survey.

Regarding the push notifications that were prompted to respondents in order to survey their planning strategies and use of travel information, the questions and response options available are presented in Table 2.

Table 2 Questions prompted to survey respondents after each PT trip segment

Influenced by Csikos and Currie (2008), who aggregated first waiting times (FWT) from smart card data into four distinct archetypes of passenger behaviour based on the distribution of waiting times for individuals and in relation to the number of departing lines, we also analysed FWT distributions defined by these authors’ four archetypes: “Like clockwork”, with minimal FWT of, at the most, a few minutes; “Consistent within a wider window”; “Consistent plus outliers”; and “Largely random”. We used the median differences between the upper and lower quartiles as a measure of FWT variability and defined the four archetypes by using the four quartiles of these medians (thus, respondents were grouped into four equally large archetype groups). The rationale behind this choice of measure, as also discussed by Csikos and Currie (2008), is to eliminate outliers. Based on these definitions, we performed cross-tabulations with Chi square tests between the four FWT archetypes and the stated planning and information usage strategy variables, to elucidate the validity of the former.
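One way to read the grouping rule above is sketched below: compute each respondent's interquartile range (IQR) of FWT, then split respondents at the quartiles of those IQRs. This is an interpretation for illustration only. The function names, the respondent identifiers, the example waiting times and the choice of the "inclusive" quantile method are all assumptions and are not taken from the study.

```python
from statistics import quantiles
from typing import Dict, List

ARCHETYPES = [
    "Like clockwork",
    "Consistent within a wider window",
    "Consistent plus outliers",
    "Largely random",
]


def fwt_iqr(fwts: List[float]) -> float:
    """Difference between upper and lower quartile of one respondent's FWTs.

    Needs at least two observations per respondent.
    """
    q1, _, q3 = quantiles(fwts, n=4, method="inclusive")
    return q3 - q1


def assign_archetypes(fwt_by_respondent: Dict[str, List[float]]) -> Dict[str, str]:
    """Group respondents into four archetypes by quartile of their FWT spread."""
    spread = {rid: fwt_iqr(v) for rid, v in fwt_by_respondent.items()}
    cuts = quantiles(list(spread.values()), n=4, method="inclusive")
    # Count how many cut points each spread exceeds: 0..3 -> archetype index.
    return {rid: ARCHETYPES[sum(s > c for c in cuts)] for rid, s in spread.items()}


# Hypothetical first waiting times in seconds for four respondents.
example_fwt = {
    "A": [60, 60, 60, 60, 60],
    "B": [60, 90, 120, 150, 180],
    "C": [0, 120, 240, 360, 480],
    "D": [0, 300, 600, 900, 1200],
}
example_groups = assign_archetypes(example_fwt)
```

Because the IQR ignores the top and bottom quarter of each respondent's waits, single very long waits do not move a respondent between groups, which is the outlier argument made in the text.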

Cross-tabulations, along with non-parametric Chi square tests (see Table 3), were applied to test the potential influence of personal characteristics and trip-related attributes on the stated planning and optimisation strategy or information usage.

Table 3 Non-parametric tests applied to explore the various impacts on stated strategies and information access (results are presented in Sect. 3)

The correlation between the stated planning and optimisation strategies and usage of pre-trip information was controlled for by evaluating Pearson’s r and Spearman’s ρ from pairwise correlation tests. The next section presents the results from these models and tests, as well as the methodology applied to produce data for the variables used in the models and tests.
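For readers who want to reproduce this kind of pairwise check, the two coefficients can be computed as sketched below. The code is a generic, dependency-free illustration rather than the software used in the study. The function names are hypothetical, and Spearman's ρ is computed here as Pearson's r on average ranks, which is the usual definition when ties are present.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation (raises ZeroDivisionError if a variable is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson's r applied to average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical 0/1 codings of two stated-strategy variables per trip segment.
planned = [1, 1, 0, 0, 1, 0, 1, 1]
used_info = [1, 0, 0, 0, 1, 1, 1, 1]
r_example = pearson_r(planned, used_info)
rho_example = spearman_rho(planned, used_info)
```

For two binary variables the two coefficients coincide, so reporting both mainly matters when one of the variables is ordinal with more than two levels.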

3 Results

3.1 Overview of notification responses concerning strategies

Proportions of trip segments performed under different planning and information usage strategies, according to our survey participants’ responses to phone notifications, are presented in Table 4, where each part of the table refers to a question posed to the participant by the survey app during or just after completion of a trip segment, thus to some extent reflecting particular contextual choice situations. It should be noted that the proportions refer to trip segments and not to individuals, meaning there is a risk of over-representation of single individuals. However, only four out of 136 respondents did not respond at all to these questions.

Table 4 Stated strategies for pre-trip planning and information use, as indicated by survey responses (on trip segment level)

The spread of planning approaches (planning or not planning ahead of a trip) was analysed with respect to individual respondents. The responses vary somewhat more across individuals than for each individual. Out of the 132 respondents who delivered valid data, only 1.6% stated “Planning ahead” for all trip segments. The mean proportion of planned trip segments was 55% with a standard deviation of 40%. Note that these figures are trip segment-based and the mean number of PT trip segments per trip is 2.46 in the sample. However, we were also able to measure the proportion of planned PT trips instead of trip segments, and we found that 57% of PT trips were actually planned ahead (or contained at least one trip segment which was pre-planned) using a timetable or journey planner, according to the replies in the prompted-recall survey.
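The difference between the segment-based and the trip-based share can be made concrete with a small sketch. It assumes, as the text states, that a trip counts as planned when at least one of its segments was reported as pre-planned. The function name, the trip identifiers and the example data are hypothetical.

```python
from typing import Dict, List, Tuple


def planned_shares(segments: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (share of planned segments, share of trips with >= 1 planned segment).

    Each element of `segments` is (trip_id, segment_was_planned).
    """
    segment_share = sum(planned for _, planned in segments) / len(segments)
    trip_planned: Dict[str, bool] = {}
    for trip_id, planned in segments:
        # A trip is flagged as planned as soon as any of its segments is.
        trip_planned[trip_id] = trip_planned.get(trip_id, False) or planned
    trip_share = sum(trip_planned.values()) / len(trip_planned)
    return segment_share, trip_share


# Hypothetical responses: three trips made up of five segments in total.
example_segments = [("t1", True), ("t1", False), ("t2", False),
                    ("t3", False), ("t3", False)]
seg_share, trip_share = planned_shares(example_segments)
```

Under this rule the trip-based share can never be lower than it would be if every segment of a trip had to be planned, which is one reason the two percentages in the text need not agree.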

3.2 Possible relationships between stated planning and information usage strategy, and revealed waiting times

The results from our ANOVA models (cf. Table 1), in which FWT and TWT were tested with regard to the stated use of planning and information usage strategies, as well as a number of other explanatory variables, are shown in Tables 5, 6 and 7. Since model 1 for FWT did not indicate significant results, it has been omitted here. As indicated by model 2, however, the use of a deliberate planning strategy has a significant impact on waiting times, be it FWT or TWT. Tamhane’s post hoc tests reveal a mean difference of 42 s (p = 0.002), with trips stated as having been pre-planned entailing longer FWT than trips with no stated pre-planning, whereas the stated pre-use of information entailed, on average, 25 s shorter FWT than the stated non-use. However, the latter result is not significant at the 5% level (p = 0.057). These results are illustrated in the estimated marginal means plots of Fig. 2. According to the ANOVA model, an important determinant for FWT also appears to be trip purpose in interaction with stop type, followed by trip purpose, stop type and day type, as indicated by their respective explanatory power (Footnote 4).

Table 5 Results from univariate ANOVA (model 2) with FWT as dependent variable
Table 6 Results from univariate ANOVA (model 1) with TWT as dependent variable
Table 7 Results from univariate ANOVA (model 2) with TWT as dependent variable
Fig. 2 Effect on estimated marginal means of FWT from stated pre-trip planning and information use, respectively

The effect of information on FWTs is also indicated by comparing the distribution of FWT, at different headways, for trip segments with and without stated information pre-use (Fig. 3). Despite a somewhat heterogeneous overall picture, for single and multiple PT service trajectories with a combined headway of 10 min, there was a tendency to display FWT minimisation behaviour for users of pre-trip information, while non-users have multiple FWTs closer to durations representing half of the scheduled headways. For single stop pairs serviced by single line routes with a scheduled headway of 10 min, the mean FWT was 1 min, 10 s shorter for trips where respondents stated use of pre-trip information. Yet, for a 15-min headway, the FWT was approximately the same amount of time longer when pre-trip information had been used than when it had not (both results are significant at a 5% level according to two-sided t tests). For multiple line stop pairs, similar results were obtained. However, they were only significant for a 15-min headway.
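The half-headway benchmark referred to above follows from a standard result: if passengers arrive at a stop uniformly at random and can always board the next departure, the mean wait equals the sum of squared headways divided by twice the sum of headways, which reduces to half the headway for a perfectly regular service. The sketch below only illustrates that benchmark. The random-arrival assumption and the function name are ours and are not findings of the survey.

```python
from typing import Sequence


def mean_random_arrival_wait(headways_min: Sequence[float]) -> float:
    """Mean wait (minutes) for passengers arriving uniformly at random.

    `headways_min` lists the successive gaps between departures.
    Assumes every passenger boards the first departure after arriving.
    """
    total = sum(headways_min)
    return sum(h * h for h in headways_min) / (2 * total)


# Perfectly regular services: the benchmark is half the headway.
wait_10 = mean_random_arrival_wait([10, 10, 10])
wait_15 = mean_random_arrival_wait([15, 15, 15, 15])

# Same mean headway of 10 min but uneven gaps: random arrivals wait longer.
wait_uneven = mean_random_arrival_wait([5, 15])
```

A mean FWT well below this benchmark, as observed for information users at the 10-min headway, is therefore consistent with passengers timing their arrival at the stop rather than arriving at random.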

Fig. 3 Probability density functions of FWTs for trip segments where respondents stated use and non-use of pre-trip information, respectively. Diagrams to the left represent trips between origin and destination stop pairs serviced by a single line route, whereas diagrams to the right represent trips made between stop pairs serviced by multiple line routes

The effect on TWT of pre-planning is indicated by the results of Tamhane’s T2 post hoc tests associated with the ANOVA models presented in Tables 6 and 7. These results indicate a two and a half minute longer TWT for those respondents who claimed to have planned ahead of their trip, compared to those who did not plan ahead. In contrast, the stated use of digital pre-trip information entails 1 min and 38 s less TWT compared to no such use (see also Fig. 4). However, trip duration appears to be an underlying factor affecting both transfer waiting time and planning strategy, as indicated by the finding that there is a weak positive correlation between TWT and trip duration (standardised coefficient of 0.119 and adjusted R² = 0.013, see Fig. 5 for a graphic representation). A similar tendency is present in the FWT data.
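The two regression figures quoted above can be related to each other with a short calculation. With a single predictor, the standardised coefficient equals Pearson's r, so R² is its square, and the adjusted R² is slightly lower depending on sample size. The sketch below uses the reported coefficient of 0.119, while the sample sizes are hypothetical placeholders chosen only to show the direction of the adjustment, not values from the study.

```python
def adjusted_r2(r2: float, n: int, k: int = 1) -> float:
    """Adjusted R-squared for n observations and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


beta = 0.119    # standardised coefficient reported in the text
r2 = beta ** 2  # single predictor: standardised beta equals Pearson's r

# Hypothetical sample sizes, NOT taken from the paper.
adjusted_by_n = {n: adjusted_r2(r2, n) for n in (200, 500, 1000)}
```

In other words, trip duration accounts for roughly 1.4% of the variance in TWT before adjustment, which is why the correlation is described as weak.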

Fig. 4 Effect on estimated marginal means of TWT from stated pre-trip planning and information use, respectively

Fig. 5 Individual transfer waiting times regressed against trip duration (origin to destination)

The interaction between trip duration and choice of planning and information strategy is further corroborated by results from a significant Chi square test of planning strategy against trip duration (Table 8), in which there is an over-representation of trip segments (relative to the expected number) with respondents stating that they planned ahead of the trip for trips of more than 1 h in duration (observed: 200; expected: 157). On the other hand, there is an under-representation of pre-planned trips (relative to the expected number) among trips lasting less than 30 min (observed: 19; expected: 32), and the opposite applies to trips for which the respondent stated that they did not plan ahead (observed: 54; expected: 95 for trips exceeding 60 min and observed: 34; expected: 19.4 for trips below 30 min in duration).

Table 8 Results from Chi square tests on stated strategies for pre-trip planning and information use

Tamhane’s T2 post hoc tests yield further significant results regarding TWT. Transfer times are, on average, 3 min longer at interchange stops than at ordinary urban stops, and 2 min longer than at urban terminus stops. Regarding the outcome of context-sensitive notifications on strategic behaviour, transfer times are, on average, 1 min 30 s shorter for trips for which a travel planner was stated as the information source than for trips for which pre-knowledge of the timetable was stated.

3.3 Other possible explanations for trip planning and information usage strategies

Most trip and respondent-related attributes, except for home vs. activity at trip origin or destination, have significant correlations with both stated pre-trip planning and information use, as indicated by the significance results in Table 8.

There is a certain degree of correlation between stated planning strategy and stated information use, where there is some positive influence of previous knowledge of a timetable or use of a journey planner, respectively, as well as stated use of a planning strategy (observed: 625 and 534, expected: 450 and 474, respectively). Similarly, when cross tabulating the aggregated variables of pre-trip planning and usage of information, there was an under-representation of information usage for pre-planned trips (observed: 539; expected: 578). There is also only a very weak correlation between having pre-knowledge of the timetable and stating not having used written information ahead of leaving the trip origin to head off for the first bus stop of a journey (observed: 229, expected: 274), which is reasonable assuming the respondents interpret this alternative as meaning that they already possessed the information they required (thus, no support for an assumption of purely random, non-planned behaviour).

For trips using services with headways below 5 min, respondents stating not having pre-planned are over-represented in the data (in relation to the expected number, observed: 202; expected: 168). The same pattern holds for trips using unreliable lines (obs. 311, exp. 277 for lines with a reliability index (Footnote 5) below 0.26), trips starting from urban stops (obs. 329, exp. 310), short trips (conversely, obs. 171; exp. 232 for trips longer than 60 min), work (commuting) trips (obs. 228; exp. 218) or trips from work (obs. 203; exp. 192), shopping trips (obs. 77; exp. 69), employees (obs. 453; exp. 446) and people who travelled with PT more than 14 times during the 14-day survey period (true for all intervals, e.g. 14–21 times: obs. 291; exp. 256). On the other hand, pre-trip planning is over-represented for trips made during off-peak daytime (obs. 281, exp. 248), women stated pre-planning to a higher degree than men (women: obs. 726, exp. 670; men: obs. 438, exp. 494), and the same applies to people above 50 years of age (ages 51–65: obs. 467, exp. 409). Trips made from interchanges and rural stops are also over-represented among the pre-planned trips (obs. 419; exp. 384 and obs. 14; exp. 11, respectively).

Consulting digital travel planning aids is over-represented for trips made on Saturdays and Sundays (obs. 60 and 36; exp. 46 and 29, respectively), trips made from urban stops (obs. 436, exp. 401), trips during off-peak daytime (obs. 234; exp. 210), men (obs. 396; exp. 358), less frequent PT travellers (Footnote 6; obs. 100; exp. 70), very frequent PT travellers (Footnote 7; obs. 52, exp. 35), young travellers (ages 20–35: obs. 504, exp. 391) and students (obs. 370; exp. 311), while pre-knowledge of the timetable is over-represented among women (obs. 556, exp. 514), on work trips (obs. 138; exp. 111) and for travellers above 50 years of age (ages 51–65: obs. 137; exp. 128). The scheduled headway appears to affect whether or not respondents knew the timetable by heart. For stop pairs served by high-frequency direct PT connections, with a combined headway of 5 min or less, there is an under-representation of pre-knowledge of the timetable (obs. 74; exp. 85), while at 10-min combined headway the opposite pattern emerges (obs. 147; exp. 129), indicating that this particular headway appears to be easier to recall than others. Also, the reliability of the line appears to affect the information usage strategies; trips using lines with low reliability (reliability index at 0.25 or below) are under-represented among users of travel planners (obs. 398; exp. 426) but over-represented among respondents who stated that they did not pre-consult departure time information (obs. 311; exp. 273) and among respondents who reported no pre-knowledge of the timetable (obs. 508; exp. 470).

3.4 Potential factors influencing the stated use of optimisation strategies

Analysing potential explanations for the use of optimisation strategies in our data, as manifested by respondents who stated their desired arrival or departure times in a digital journey planner at the pre-trip planning stage, we found a significant correlation with stated pre-trip activity (Table 9). Thus, being at work means a degree of over-representation of selecting departure time at the pre-trip planning stage (obs. 61, exp. 50). Trip purpose, i.e. the activity performed after the trip, has significant influence on the stated choice of desired time of departure or arrival, respectively, when planning the trip with a travel planner. The clearest results were obtained for school trips, where there was an over-representation of arrival time selections with an observed value of 61 compared to an expected value of 52. For trips to work, departure time was somewhat over-represented in pre-trip planning with obs. 272 and exp. 262.

Table 9 Results from Chi square tests on stated optimisation strategy in journey planner

When trip origins and destinations are grouped according to whether belonging to the home or activity end (cf. Hoogendoorn-Lanser et al. 2006), the results indicate that there is a weak tendency (Pearson Chi Square p value of 0.099) for trip origins at the activity end to apply a departure time optimising strategy (obs. 586; exp. 572), whereas, at the home end, respondents tend to be over-represented in the arrival time optimising group (obs. 198; exp. 184).

According to our data, gender has a significant influence on optimisation strategy. Thus, men are under-represented in the arrival time optimising category while women are over-represented (men: obs. 128, exp. 183; women: obs. 356, exp. 301). As for the departure time optimising strategy, the opposite applies (men: obs. 408, exp. 353; women: obs. 524, exp. 579). There is also significant influence on the choice of optimisation strategy from: (1) stop type when first boarding (urban locations have an over-representation of departure time optimising strategy), (2) respondent occupation (students were over-represented for the arrival time optimisation strategy), (3) flexible working time (over-representation for arrival time optimisation for respondents who do not have this employment type), (4) age (over-representation for arrival time optimisation for 20–35 year-olds) and (5) the number of PT trips made during the survey period (under-representation for arrival time optimising for respondents who made less than one trip on average per day).
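The observed and expected counts quoted for gender form a complete 2 by 2 table, so they offer a convenient worked example of how the expected values in these cross-tabulations arise. The sketch below recomputes them with a plain Pearson Chi square test of independence. The function name is hypothetical and the code is a generic illustration, not the statistical software used in the study.

```python
from typing import List, Tuple


def chi_square_independence(
    observed: List[List[float]],
) -> Tuple[float, List[List[float]]]:
    """Pearson Chi square statistic and expected counts for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return stat, expected


# Rows: men, women. Columns: arrival time, departure time optimising.
# Observed counts as reported in the text.
gender_table = [[128, 408], [356, 524]]
chi2_stat, expected_counts = chi_square_independence(gender_table)

# Critical value of the Chi square distribution with 1 degree of freedom at 5%.
CRITICAL_1DF_5PCT = 3.841
is_significant = chi2_stat > CRITICAL_1DF_5PCT
```

Rounded to whole numbers, the recomputed expected counts match the figures reported in the text, and the statistic lies well above the 5% critical value for one degree of freedom, in line with the stated significance.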

3.5 Waiting time archetypes and potential explanatory factors

When analysing the spread of waiting times in relation to the stated strategies, we used the categories, or archetypes, proposed by Csikos and Currie (2008) regarding cumulative distribution functions (CDFs) of median differences between the upper and lower FWT quartiles (note that Csikos and Currie refer to the waiting time as Arrival Offset rather than FWT). In Figs. 6, 7, 8 and 9, CDFs of median FWTs across individuals are shown for each archetype, or quartile of differences between the upper and lower quartile of FWTs from the total sample. When compared with the corresponding profiles in the study by Csikos and Currie, there are some similarities to the first (“like clockwork”, Fig. 6), the third (“consistent plus outliers”, Fig. 8) and the fourth quartile (“largely random”, Fig. 9), while the FWTs of the second quartile (“consistent within a wider window”, Fig. 7) are less consistent in our data. In general, our data contain a narrower range of FWTs than those of Csikos and Currie, with a mean difference between the upper and lower quartiles of just 3:27 min and a standard deviation of 2:43 min (for Csikos and Currie, these mean values range between 11:48 and 16:36 min with standard deviations in the interval [16:36, 25:18] min depending on the analysed station).
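The grouping into archetypes can be illustrated with a short Python sketch. It assumes that each respondent's spread is the interquartile range of his or her FWTs and that respondents are then split into four groups at the sample quartiles of that spread. The data, the function names and the choice of the inclusive quantile method are hypothetical and are not taken from the survey.

```python
from statistics import quantiles

ARCHETYPES = [
    "like clockwork",
    "consistent within a wider window",
    "consistent plus outliers",
    "largely random",
]


def fwt_spread(fwts):
    """Difference between the upper and lower quartile of one person's FWTs."""
    q1, _, q3 = quantiles(fwts, n=4, method="inclusive")
    return q3 - q1


def assign_archetypes(fwt_by_person):
    """Label each person by the quartile of the sample in which their spread falls."""
    spreads = {person: fwt_spread(fwts) for person, fwts in fwt_by_person.items()}
    cut1, cut2, cut3 = quantiles(list(spreads.values()), n=4, method="inclusive")
    labels = {}
    for person, spread in spreads.items():
        # Count how many cut points the spread exceeds (0 to 3).
        index = (spread > cut1) + (spread > cut2) + (spread > cut3)
        labels[person] = ARCHETYPES[index]
    return labels


# Hypothetical first waiting times in minutes for four respondents.
sample = {
    "a": [2, 2, 2, 2, 2],
    "b": [1, 2, 3, 4, 5],
    "c": [0, 2, 4, 6, 8],
    "d": [0, 5, 10, 15, 20],
}
labels = assign_archetypes(sample)
print(labels)
```

With real data, ties at the cut points and respondents with very few recorded trips would need explicit handling, which this sketch leaves out.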

Fig. 6 Archetype “like clockwork” according to Csikos and Currie (2008). Cumulative distribution of median First Waiting Times (FWT = x) for the first quartile of differences between the upper and lower quartile of FWT

Fig. 7 Archetype “consistent within a wider window” according to Csikos and Currie (2008). Cumulative distribution of median First Waiting Times (FWT = x) for the second quartile of differences between the upper and lower quartile of FWT

Fig. 8 Archetype “consistent plus outliers” according to Csikos and Currie (2008). Cumulative distribution of median First Waiting Times (FWT = x) for the third quartile of differences between the upper and lower quartile of FWT

Fig. 9 Archetype “largely random” according to Csikos and Currie (2008). Cumulative distribution of median First Waiting Times (FWT = x) for the fourth quartile of differences between the upper and lower quartile of FWT

When cross-tabulating the FWT archetypes with the variables of the stated planning strategy and the use of pre-trip information, significant Chi square results corroborate our ANOVA findings reported above, in that deliberate pre-planning does not automatically result in systematically shorter FWTs (for non-planned trips: “like clockwork” obs. 144, exp. 116; “largely random” obs. 177, exp. 184). On the other hand, the deliberate use of planning aids or consulting printed timetables has a structuring effect on FWTs (for trips where planning aids were used: “like clockwork” obs. 180, exp. 139; “largely random” obs. 189, exp. 232).

When analysing FWT spread archetypes across respondent characteristics using Chi square tests, we found (Table 10) that employees were over-represented in the first archetype (“like clockwork”, obs. 276; exp. 239) while students were over-represented in the “largely random” quartile (obs. 278; exp. 203). Age also has a significant influence on FWT archetype: respondents 20–35 years of age were over-represented in the “like clockwork” quartile (obs. 216; exp. 159), while those in the 51–65 age group were under-represented in the “largely random” quartile (obs. 36; exp. 87). Concerning gender, there are some interesting patterns in the data, but at a low significance level (linear-by-linear association significance of 0.069). Women are over-represented in both the lowermost and uppermost quartile (obs. 275 and 410, respectively; exp. 229 and 350, respectively) while men are over-represented in the “consistent within a wider window” and “consistent plus outliers” archetypes (obs. 319 and 291, respectively; exp. 242 and 262, respectively).

Table 10 The results from Chi square tests on first waiting time archetypes

4 Discussion on methodology and results

The overall results of our study provide further evidence that the use of pre-trip information reduces actual waiting times. The effect is seen both at first boarding stop and at transfers. These results are in line with most other studies in this field (Brakewood and Watkins, 2018), although our results indicate a somewhat smaller waiting time gain than most other studies and that FWT gains are mostly confined to certain departure frequencies—especially to lines with a 10-min scheduled headway. For shorter headways, the lack of significance may relate to a limited potential of travel time savings.

The significantly negative effect of pre-trip information usage on waiting times at 15-min headways, which relates to a weak, but significant, positive correlation between trip duration and headway on the first PT trip leg (coefficient: 0.065; R² = 0.004), provides further evidence of a kind of “planning paradox”, as mentioned in Sect. 3.2. This implies that for longer, non-routine trips (i.e. non-commutes), for which PT service headways are longer, pre-trip planning and information use is undertaken more extensively than for shorter and commute trips (supporting the findings of Peirce 2003). The former (non-routine) trips therefore combine more extensive use of pre-trip information with longer waiting times than the latter (familiar) trips. Thus, more random behaviour (“board a PT vehicle on whatever line arrives first”) could relate to a high level of travel routine, while unfamiliar trips are associated with a higher tendency to stick to a specific line and/or departure. The relatively small-scale waiting time effects might be a result of our approach to collecting trip data. In pilot studies, the TRavelVU app has sometimes been found to record short walk legs during what travellers would regard as waiting, for instance when they walk around on a railway platform. Extended waiting times are also sometimes used to run errands, etc.

Regarding pre-trip information utilisation and planning strategies, our study somewhat corroborates the findings of Mulley et al. (2017) and Farag and Lyons (2008). Thus, we found a positive relationship between a very high PT trip rate and the use of different (digital) sources of pre-trip information, even though the relatively short survey period renders our measurements of trip rate somewhat uncertain. Of more interest, perhaps, are the significant differences in information usage between genders and age groups. According to our results, women tend to plan ahead to a larger extent than men, and younger travellers use digital tools to a higher extent than elderly travellers (corroborated by Ghahramani and Brakewood 2016 and Farag and Lyons 2008; the age component of the use of digital planning aids has been further studied by, for example, Velaga et al. 2012).

Returning to our initial research questions, our results suggest that the duration of a trip is a confounding factor for waiting times (both FWT and TWT) and the use of deliberate pre-trip planning. This is somewhat contrary to our initial expectations, and also to the results of Fonzone and Schmöcker (2014), who show that the more structured traveller [the Busy (4) approach] gains a substantial amount of time in relation to the less structured traveller (ASAYC and strategic approach). However, in a real-world setting such as our study, it is clear that the significant range of trip durations comes into play to a much higher extent than in the idealised network applied by Fonzone and Schmöcker (2014). Even so, our findings corroborate their results regarding pre-trip information, although we only measure waiting times and not the duration of complete OD trips.Footnote 8

The split and relationships between departure and arrival time and passenger and trip attributes such as the flexibility of working hours were thoroughly investigated by Thorhauge et al. (2016) in their modelling of departure times and willingness to pay for avoiding a changed departure time interval. In a sense, our results corroborate their findings of the greater significance of travel time optimisation for trips made by individuals who lack flexible working hours, as indicated by the prevalence in our data of “like clockwork” FWT behaviour and arrival time optimisation.

In one sense, our results regarding FWT archetypes could be considered counter-intuitive: there is an over-representation in the “largely random” archetype for trips in which the respondents stated that they used a planning strategy. In our view, these results could stem from the “planning paradox” associated with trip duration mentioned previously (longer trips may require more planning and also involve longer waiting times). Also, the correlation between FWT archetype and reliability appears to be quite weak, with a linear-by-linear association significance of just 0.069.

The relatively low explanatory power of the variables indicating information usage and planning strategies in our ANOVA models may relate to the timing of the notifications sent to the survey participants. The term interruptibility, as introduced by Turner et al. (2017), implies suitable moments for being able to respond to smartphone-distributed push notifications. The tendency in our results for travellers to repeat previous replies when prompted in this way may relate to the level of mental ability of the traveller en route (also perhaps an effect of habit, as investigated by Verplanken et al. 1997). The high level of intra-personal correlation is the clearest indication of this tendency, which may represent a bias in relation to true behaviour regarding pre-trip planning and information use, and may thus contribute to these strategy variables not being significant in our ANOVA models of FWT and TWT. As very few other studies employ our methodology (or a similar one), there is a clear need for further empirical observations and related improvement of the methodology. As we have not been able to control for selection bias in our survey sample in relation to the population under study, caution is recommended when generalising our results to other contexts. For instance, and as other authors have found (Gadziński, 2018; Greaves et al. 2015), participant attrition due to phone battery drainage or perceived survey fatigue (Assemi et al. 2018) is a common reason for leaving this kind of survey. In our case, this resulted in 36 persons (out of 172 registered) not recording any data (for a further discussion, see Berggren et al. 2019).

5 Conclusions

We used the results from a user-mediated smartphone survey, collecting trip data utilising a dedicated application, in order to investigate and explore the use of pre-trip information and of planning and optimisation strategies among passengers in an urban/regional PT route network with frequent occurrence of departure time uncertainty. We found that pre-trip planning and information use had significant effects on the strategically important trip segments of first waiting time and transfer waiting time, and that these effects differed depending on the scheduled departure frequency of the line route of the first PT leg. Moreover, our results indicate that the stated uses of planning strategies and pre-trip information were related to trip purpose and duration, previous activity, day type and time of day, line reliability, respondent age, gender and occupation, stop type and the number of trips made. Thus, pre-planning was more common among infrequent PT travellers, women, and travellers who make longer trips, as well as for trips starting with a reliable line route at the first PT leg and for trips in urban contexts. The elevated pre-use of information was evident for longer trips made at weekends or during off-peak daytime using reliable PT lines in urban areas by young, male, less familiar or very frequent PT users and students. In addition, we were able to obtain reasonable FWT archetypes, as proposed by Csikos and Currie (2008), and discussed how to use these as indicators of different strategic passenger behaviour by relating them to respondent characteristics. Here, we found that trip duration influenced both FWT and whether or not passengers pre-planned their trip, thus supporting the findings of Farag and Lyons (2008). The use of information such as journey planners or printed timetables prior to departure is associated with shorter FWT, thus corroborating the results of other researchers (Brakewood and Watkins, 2018).

The results may form a basis for the design and marketing of information resources for different PT user groups. From our results, it appears that there are aspects of the APTTIS system that should be improved or changed in order to also be of use to travellers using unreliable lines at very high departure frequencies.

In future studies, it would be interesting to apply a nested or hierarchical approach to the information retrieval and planning process in order to relate these processes to each other. A future approach could also be to estimate route choice models on our revealed trip data, thus elucidating further behavioural traits depending on possession and usage of pre-trip and en route information regarding departures at origin and transfer points along the course of a PT trip. This latter approach could also include additional passenger groups, based on more detailed information on individual characteristics.