A minute of your time: The impact of survey recruitment method and interview location on the value of travel time

Web-based stated preference (SP) surveys are widely used to estimate values of travel time (VTT) for cost–benefit analysis, often with internet panels as the source of recruitment. The recruitment method could potentially bias the results because (1) those who frequently participate in surveys may have a lower opportunity cost of time and (2) people who answer the survey at home or in the office may answer differently because the choice situation is less salient to them. In this paper, we investigate both mechanisms using data from a VTT choice experiment study where respondents were recruited from an internet panel, an alternative email register or on-board/on the station. Within all three groups, some complete the survey while making an actual trip. We find that respondents who were recruited from the internet panel or report being members of a panel have a significantly lower VTT, suggesting that internet panels are less representative in this respect compared to other recruitment methods. We also find that those who answer while traveling have a higher VTT, possibly because the benefits of saving travel time are more salient to them than to those who answer while not traveling.


Introduction
Survey-based stated preference (SP) studies are widely used to estimate values of travel time (VTT) for application in cost-benefit analysis (Wardman et al. 2016; Flügel and Halse 2021). This approach has at least two important potential drawbacks. One is that those who spend time answering a survey may have a lower opportunity cost of time than the typical traveler. This is an issue of representativeness that can only partly be accounted for by controlling for observed respondent characteristics. The other potential drawback is that people might choose differently in a hypothetical choice setting than when making actual travel choices. This issue relates to the choice context and interview situation.
The influence of these factors is likely to depend on the survey recruitment and interview method chosen. First, internet panels are increasingly used as a source of recruitment for surveys in social science, including in SP surveys (Sandorf et al. 2022). While all voluntary recruitment potentially implies self-selection, those recruited from an internet panel will typically have spent more time answering other surveys before, and a significant share of them may be motivated by the relatively small compensation paid per minute. This could imply that panel members are less representative in terms of time and/or cost preferences than respondents recruited from other sources. Using data from the 2009-2011 Dutch VTT study, Significance et al. (2013) and Kouwenhoven et al. (2014) found that panel members have substantially lower VTT.
Second, respondents are often invited via email and complete the survey online. This applies to studies using internet panels as well as other studies based on email recruitment. An alternative to this is to recruit and interview people while they are traveling. In both cases, the choice questions are hypothetical, but the answers people give could depend on how realistic they perceive the choice situation to be. Those who answer the survey while traveling are likely to feel closer to the choice situation than those who answer while not traveling, typically at home or in their office. The VTT estimated based on the latter group could suffer from a type of hypothetical bias that could potentially go in either direction. In light of the phenomenon that people perceive experiences as more positive in retrospect (Mitchell et al. 1997), one hypothesis is that those who answer the survey while not traveling are less able to relate to the discomfort of travel and therefore put less weight on travel time. This factor would also imply a downward bias in the VTT. 1
We investigate both issues based on data from a large-scale SP survey which was conducted in Norway in 2018 with the purpose of estimating national unit values for the VTT. Our data has three important features: (1) Respondents were recruited from three different sources: an internet panel, an alternative email register, and intercept. (2) All respondents were asked whether they were members of an internet panel. This means that we can investigate the effect of internet panel membership on VTT more generally, not just whether members of this particular panel have a different VTT. (3) Those who participated online were asked if they were currently traveling while answering the survey. Hence, we can estimate the effect of interview location 2 (traveling or not traveling), controlling for recruitment method.
We find that those who report being active members of an internet panel have a significantly lower VTT than non-members, and that this also holds among those who were not recruited from an internet panel for this particular survey. This suggests that internet panel members have a lower opportunity cost of time which negatively impacts their VTT, and that this may be a general characteristic of internet panels. Although our samples differ somewhat in observable characteristics, the results are highly robust across different model specifications with different sets of control variables. They are also in line with the evidence from the 2009-2011 Dutch study.
We also find that those who answer while not traveling have a significantly lower VTT compared to those who answer while traveling. This effect is also highly robust across specifications. One possible explanation for this finding is that when people are further away from the context that the choice situation mimics, this results in a downward hypothetical bias. This potential mechanism could be further explored in future studies using an experimental approach.
Our study contributes to a small existing literature on the impact of survey recruitment method in VTT studies (Börjesson and Algers 2011; Hanssen 2012; Significance et al. 2013; Lu et al. 2018) as well as to the more general and much larger literature on survey mode effects and the representativeness of internet panels in environmental economics (e.g. Lindhjem and Navrud 2011a; Boyle et al. 2016; Menegaki et al. 2016; Sandorf et al. 2020, 2022), and in the survey method literature more generally (e.g. Callegaro et al. 2014; Zhang et al. 2020). As internet panels and web surveys are widely used in SP surveys, partly for cost reasons, our results have potentially strong implications for research and practice related to the value of travel time. Not accounting for these issues could potentially bias the results of cost-benefit analysis of transport projects.
Our paper proceeds as follows: In Section "Previous literature", we review the literature on hypothetical bias and survey methods in VTT studies as well as other SP studies, focusing particularly on the experiences from the 2009-2011 Dutch VTT study. In Section "Survey design and data collection", we describe the survey design of the 2018 Norwegian VTT study, and in Section "Data overview" we present some key characteristics of the data. Section "Empirical modeling" explains our empirical specification and shows the estimated effects of survey recruitment method and interview location on the VTT. Section "Discussion" gives a discussion of the findings and Section "Conclusion" concludes.

Hypothetical choice situations and bias in SP studies
Stated preference (SP) studies are hypothetical in their nature, which means that there is no guarantee that respondents make the same choices in the survey as they would have done in a real-life setting. There is an extensive literature exploring possible biases in SP results due to this (Harrison 2014). In environmental economics, the typical concern is that respondents will overstate their willingness to pay for environmental goods (e.g., Loomis 2011). However, this could be mitigated by using various good practice principles, including consequentiality designs and stated choice questions including multiple attributes, forcing the respondent to make tradeoffs on more than just one margin (Johnston et al. 2017).
In SP studies of the VTT, the VTT is often estimated based on stated choice questions with only travel time and cost as attributes, but sometimes also on stated choice experiments with more than two attributes (Flügel and Halse 2021). Comparing with the results of revealed preference (RP) studies, the typical finding is that the VTT is lower in SP studies (Small 2012). Shires and de Jong (2009) find that SP values are at least 25 percent lower for non-business trips, while Wardman et al. (2016) find SP values to be about 10-15 percent lower than those from RP studies. A possible reason is that respondents pay more attention to travel cost in the SP setting, where costs are explicitly presented, than they do when making real-life travel choices.
Another concern is the perceived realism of the choice task. In many recent studies, the attribute values of travel time and travel costs are centered around actual travel time and cost of a recent reference trip reported by the respondent. This could be expected to increase realism but might lead to a higher degree of reference dependence in the responses (Hess et al. 2020). The impact of this on average estimated VTT is however ambiguous.
Hence, while hypothetical bias has been thoroughly studied in the valuation of environmental and public goods, there is little research to build on to derive expectations related to the effect of hypothetical vs. actual choice situations in the valuation of travel time.

Effects of survey and interview methods
There is a growing interest in the effects of survey method or mode on data quality 3 and results in the general survey methodology literature, more recently also picked up by the SP research community (e.g. Lindhjem and Navrud 2011a;Boyle et al. 2016). Apart from potential differences in self-selection and nonresponse between different survey/recruitment modes, the main concern is related to measurement effects. If similar respondents provide systematically different answers to the same survey depending on whether answering e.g. on or off-site, by email recruitment or through an internet panel survey, a so-called "pure survey mode effect" is present (Lindhjem and Navrud 2011a).
The two main sets of factors through which this effect is believed to occur are (1) cognitive or psychological factors and (2) normative or sociological factors (Dillman et al. 2014). The former is particularly related to potential satisficing behavior, i.e. shortcutting the thought process and providing a suboptimal response. The latter is especially related to social desirability bias, i.e. the tendency of survey respondents to answer questions in a manner that will be viewed favorably by others (e.g. an interviewer). 4
The rise of internet panels in survey research has raised concerns over the extent of satisficing behavior and potentially lower quality of internet panel data. A key concern is "professional respondents" answering a large number of surveys fast and with little effort, primarily for the small monetary incentive they typically receive (Hillygus et al. 2014). It seems, however, from recent research that data quality may not be systematically lower in internet panels compared to other types of survey modes, at least for a large share of respondents, indicating that satisficing behavior may not be more prevalent among average internet panelists (Zhang et al. 2020). This is the case also when considering that many such respondents answer on smartphones or tablets while traveling, exposing themselves to a potentially higher level of distractions (Wenz 2019; de Bruijne and Oudejans 2015; Skeie et al. 2019; Liebe et al. 2015; Sandorf et al. 2022).
3 Measured in different ways, for example completeness (e.g., item nonresponse), accuracy (comparison with external benchmark data, e.g., actual votes), reliability (e.g., psychometric scale properties) and more generally by comparing response distributions of key constructs under study. In SP, indicators such as the tendency to choose status quo or other alternatives and the scale parameter (error variance) (for stated choice experiments) (Sandorf et al. 2020, 2022), as well as zero, protest, 'don't know' responses and variance in willingness to pay (contingent valuation), have been used (Lindhjem and Navrud 2011b).
4 Since social desirability bias is more frequently observed in settings with interviewers and since the VTT is a relatively neutral topic (compared to e.g. valuing public goods such as protection of the environment), this effect is unlikely to be important in our context.
A recent study of internet panel data quality found that experienced opt-in internet panelists are more likely to report "for money" as their main motivation, indicating that they trade off their time for very small monetary benefits (Zhang et al. 2020). The same study also finds, somewhat surprisingly, that these more experienced ('professional') respondents tend to provide answers of at least as high quality as other panelists who are less motivated by money. In the SP literature, most studies to date have found small to moderate effects on data quality and estimated monetary measures when comparing internet panel data with other recruitment methods (e.g. Lindhjem and Navrud 2011a, b; Boyle et al. 2016). Recent and emerging SP research, however, shows that data quality of internet panels may depend on survey experience and degree of 'professionalism' among respondents (Sandorf et al. 2020, 2022). This suggests that even if some concerns about the effects of internet panel recruitment in SP have been put to rest, such effects are still likely to be context dependent, as pointed out by the recent SP guideline by Johnston et al. (2017; p. 340). Furthermore, they might be of particular relevance when estimating a monetary measure such as the VTT, since this measure is so directly related to respondents' opportunity cost of time.
Even if the internet panel effect on indicators of data quality and rational responses may be small or moderate in SP surveys, there may still be substantial and economically significant effects on monetary measures estimated based on such data. As hypothesized here, this can be due to unobservable differences in the underlying preferences of (professional) internet panelists compared to the general population (or more precisely: respondents recruited by other means). While other SP studies have found similar or somewhat lower welfare measures from internet panel data compared to other survey modes (e.g. Lindhjem and Navrud 2011a; Boyle et al. 2016), these are not directly comparable with VTT. 5 Further, the general context-dependence of survey mode effects means that it is difficult to generalize regarding the relative magnitude of the different underlying mechanisms that are driving the effects. In the following, we will investigate indicators of both data quality and effects on VTT.
5 Lindhjem and Navrud (2011a) believe some of this effect may be due to more honest responses and less social desirability bias in internet panel surveys giving lower willingness to pay responses.

Survey and interview methods in VTT studies
Table 1 below gives an overview of recent national VTT studies in Europe, including information on how respondents were recruited and how interviews were carried out. 6 As we can see, most studies only relied on one source of recruitment and one type of interview, at least per traveler segment. Also, source of recruitment and type of interview might be bundled. 7 However, there are a few studies that exploit within-study comparisons in survey method. Börjesson and Algers (2011) use data from different samples collected in the 2007-2008 Swedish VTT study and investigate the relationship between interview method and average VTT as well as error variance in the choices made by respondents. They find that error variance is larger in the data from telephone interviews than in the data based on an online questionnaire. This seems to be related to characteristics of the interview situation itself, like time pressure. The authors also find a 45 percent lower VTT in the telephone sample, but the difference is only 9 percent when controlling for socio-economic characteristics. Hence, the authors conclude that this effect can largely be attributed to differences in socio-economic characteristics. We should keep in mind that this study is from a time when internet use was probably more highly correlated with socio-economic status.
6 Note that 'intercept' could imply either that respondents answer the survey right away on the location where they are recruited (on-site) or that they leave their contact info or receive a link to the questionnaire and answer it later.
7 In the 2012 German study, the non-business respondents who were recruited by phone were asked if they wanted to complete the survey online. However, only 5.6 percent of the respondents chose to do so, which does not allow for a sensible test regarding the effect of interview type (Dubernet and Axhausen 2020). In the 2014 UK study, no results on the effect of source of recruitment or interview type are reported (Batley et al. 2017).
Table 1 Recent national VTT studies in Europe: source of recruitment and interview type
Study | Source of recruitment | Interview type | Reference
… | … | … | Axhausen et al. (2006; …
Denmark (2004) | Internet panel | Web-survey | Fosgerau et al. (2007)
Denmark (2004) | Phone panel | Personal interview | Fosgerau et al. (2007)
Sweden (2007… | Number plate register (2007) | Telephone interview | Börjesson and Eliasson (2014)
Sweden (2007… | Population register (2008, car) | Web-survey or telephone interview | Börjesson and Eliasson (2014)
Among other VTT studies not considered national, Lu et al. (2018) use both internet recruitment and roadside interception as methods of recruitment for overlapping travel purposes/groups (as well as recruitment by phone and through Facebook). In this study about the Copenhagen Harbour Tunnel (a new toll tunnel), little evidence is found that the survey recruitment methodology impacted the resulting VTT. Covariates are included in the models to investigate whether the VTT varied by the different survey data collection methods. These covariates are found to be insignificant in the final models. The internet panel used was that of Kantar Gallup. The researchers hint at the possibility that the panel's positive outcome may be a result of the high quality of the Kantar Gallup panel.
There is also one study that exploits differences in interview location, similar to our study. Hanssen (2012) analyzes data from an SP survey among ferry passengers in Norway, where some of those who were recruited onboard the ferry answered on-site and some answered later in another location. In addition, some were recruited via mail. Somewhat contrary to expectations, Hanssen finds that those who answered on-site have a 23 percent lower value of travel time, but a 31 percent higher value of service frequency than the two other samples. However, it is not clear to what extent this can be interpreted as an effect of interview location, or whether it also picks up an effect of recruitment method.

Evidence from the Dutch VTT study
In this section, we present evidence on the effect of internet panel membership from the 2009-2011 Dutch study (Significance et al. 2013;Kouwenhoven et al. 2014), including some evidence based on new analysis of the data. This study is of particular interest because of the substantial effect of recruitment mode that was identified and the adjustments that were made to the data collection and survey design to account for this.
There were two different recruitment methods in this study, giving variation also within travel mode and trip purpose. All respondents of the 2009 survey were members of the PanelClix internet panel, a large opt-in panel. Respondents received a reward equivalent to €1.50 for the completion of a 15-20-minute questionnaire. However, the resulting VTTs were implausibly low. Therefore, in 2011 a survey with intercept or 'en-route' recruitment was carried out, using the same recruitment methods used in the previous national VTT studies in 1988 and 1997.
When these two datasets are analyzed using simple Multinomial Logit (MNL) models and correcting for inflation, the VTT from the internet panel (i.e. the 2009 respondents) is typically between 40 and 60 percent of the VTT from the respondents recruited in the field (i.e. the 2011 respondents), as can be seen from Table 2. When corrections are made for differences in socio-economic classes, the VTT-differences are reduced by about 10 percent. When more advanced MNL models (incorporating non-linear dependencies on the current travel time and cost and (especially) on the time and cost differences offered in the SP experiments) are used, the VTT from the internet panel respondents is still 10 to 20 percent lower than from the respondents recruited by the intercept method.
These results are in line with another finding from the same survey (Significance et al. 2013), shown in Table 3: In the 2011 survey, respondents were asked an additional question (that was not present in the 2009 survey) of whether they were a member of an internet panel. This information was used as an interaction variable for the VTT. The model estimations indicate that commuter respondents in the 2011 survey (i.e. recruited at intercept locations) who were members of an internet panel had on average a 20 percent lower VTT compared to commuter respondents that were not a member of any internet panel. For business respondents, VTT is 13 percent lower. For respondents traveling for other purposes, there is no significant difference. In our survey design (see Section "Recruitment method and data collection"), we replicate this feature of the Dutch 2011 survey, which allows us to identify the effect of internet panel membership among respondents not recruited from an internet panel.
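To make the reported membership effects concrete, the following minimal sketch shows how a VTT can be derived from the time and cost coefficients of a linear MNL utility function, and how a relative membership effect shifts it. The multiplicative form of the interaction, the function names and the coefficient values are illustrative assumptions; the exact specification used in the Dutch study is not given here.

```python
def vtt_per_hour(beta_time_per_min: float, beta_cost: float) -> float:
    """VTT implied by a linear utility U = beta_cost * cost + beta_time * time.

    With time measured in minutes, the ratio of the two coefficients is a
    value per minute, so it is multiplied by 60 to get a value per hour.
    """
    return 60.0 * beta_time_per_min / beta_cost


def apply_membership_effect(base_vtt: float, relative_effect: float,
                            is_member: bool) -> float:
    """Scale the VTT by (1 + relative_effect) for panel members.

    The multiplicative interaction is an assumption made for illustration.
    """
    return base_vtt * (1.0 + relative_effect) if is_member else base_vtt


# Hypothetical coefficients (not estimates from any study).
base = vtt_per_hour(beta_time_per_min=-0.05, beta_cost=-0.30)

# Relative effects reported for the 2011 intercept sample:
# commuters 20 percent lower, business travelers 13 percent lower.
commuter_member = apply_membership_effect(base, -0.20, is_member=True)
business_member = apply_membership_effect(base, -0.13, is_member=True)
print(round(base, 2), round(commuter_member, 2), round(business_member, 2))
```

The sketch only illustrates the arithmetic of a relative difference in VTT; it says nothing about how the effect was estimated or whether it is statistically significant.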
The Dutch study concluded that the 2009 internet panel survey led to substantially lower VTT than the 2011 intercept survey. The 2011 values are much more in line with the values found based on the surveys in 1988 and 1997, which had always been regarded in The Netherlands as very plausible values by the various transport sectors, and are not considered to be particularly high in an international perspective: The meta-analysis of Shires and de Jong (2009) found comparable or higher values for many other Western countries compared to the Dutch values from 1997. The most likely conclusion therefore is that the 2011 values are correct and that the 2009 values are biased downwards. In other words, the sample from the internet panel is less representative in terms of VTT.

Survey design and data collection
Our main empirical contribution is based on data from the recent Norwegian valuation study on personal travel, which included a stated preference survey on the value of travel time. In this section, we give an overview of this data, focusing on the parts that are most relevant to the effects of recruitment method and interview location. For more documentation, see the technical report from the project (Flügel et al. 2020).

Survey and choice experiment design
The survey consists of several stated choice experiments with various attributes. In this paper, we use data from the two-attribute experiment illustrated in Fig. 1, which only includes the attributes travel time and cost. Respondents using all transport modes (except walking and cycling) participate in this experiment. They are instructed to imagine two different travel options (routes), where in both options the mode of transport and all other trip characteristics are the same as on an actual reference trip reported by the respondent. In eight consecutive choice tasks, respondents choose between two alternatives with different values of travel time and cost. 8 The attribute values are pivoted around the reference values of travel time and cost from the reference trip. To avoid confounding factors, no further explanation is given regarding the reasons for differences in travel time and cost. 9 Each respondent faces two choice tasks of each of the following types: (I) Cost and travel time are equal to the reference values in one alternative, while the other alternative is more expensive and faster. (II) Cost and travel time are equal to the reference values in one alternative, while the other alternative is cheaper and slower. (III) One alternative has cost equal to the reference value but is faster. The other alternative has travel time equal to the reference value but is cheaper. (IV) One alternative has cost equal to the reference value but is slower. The other alternative has travel time equal to the reference value but is more expensive. There are no choice tasks in which one alternative is better in terms of both travel time and cost.
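The four task types described above can be sketched as follows. This is an illustrative reading of the verbal description, not the survey software itself; the class and function names, the units and the example values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    time: float  # travel time, e.g. in minutes (unit assumed)
    cost: float  # travel cost, e.g. in NOK (unit assumed)


def build_task(task_type, ref_time, ref_cost, d_time, d_cost):
    """Return the two alternatives of one choice task of type I-IV.

    d_time and d_cost are positive absolute differences relative to the
    reference trip.
    """
    if task_type == "I":    # reference vs. faster and more expensive
        return (Alternative(ref_time, ref_cost),
                Alternative(ref_time - d_time, ref_cost + d_cost))
    if task_type == "II":   # reference vs. slower and cheaper
        return (Alternative(ref_time, ref_cost),
                Alternative(ref_time + d_time, ref_cost - d_cost))
    if task_type == "III":  # reference cost but faster vs. reference time but cheaper
        return (Alternative(ref_time - d_time, ref_cost),
                Alternative(ref_time, ref_cost - d_cost))
    if task_type == "IV":   # reference cost but slower vs. reference time but more expensive
        return (Alternative(ref_time + d_time, ref_cost),
                Alternative(ref_time, ref_cost + d_cost))
    raise ValueError(f"unknown task type: {task_type}")


def dominates(a, b):
    """True if a is at least as good as b on both attributes and strictly better on one."""
    return (a.time <= b.time and a.cost <= b.cost
            and (a.time < b.time or a.cost < b.cost))


# Hypothetical reference trip: 40 minutes, 50 NOK.
for task_type in ("I", "II", "III", "IV"):
    a, b = build_task(task_type, ref_time=40.0, ref_cost=50.0, d_time=8.0, d_cost=20.0)
    print(task_type, a, b, "dominant alternative:", dominates(a, b) or dominates(b, a))
```

In every task type, one alternative is faster and the other is cheaper, so each choice involves a trade-off between time and cost.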
The design is constructed such that all respondents face some situations in which the price of saving travel time is high and some in which it is low, and both small and large changes in travel time. To achieve this, eight different percentage changes in travel time are drawn from eight different intervals and combined randomly with eight different prices of travel time ('bids'), which are also drawn from intervals. Based on each combination, the difference in travel cost is calculated. The change in travel time ranges from 10 to 30 percent, and the bid ranges from 10 NOK/hour to 750 NOK/hour. 10 The resulting differences in travel time and cost are allocated randomly across the choice tasks of type (I-IV) described above. The statistical design is identical to the design used in the previous Norwegian national value of time study (Ramjerdi et al. 2010), except that the bid range has been adjusted somewhat.
For the respondents in our sample, the questionnaire contains the stated choice experiment described in the previous section followed by another experiment which we do not consider here. 11 Before the stated choice experiments, the respondent is asked a number of questions regarding a reference trip which is used as input to the choice tasks. This includes questions about reference travel time and cost. In addition, respondents are asked to report some background characteristics of themselves and their reference trip. In Section 3.3, we explain how the reference trip is selected.
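The design procedure described above can be sketched as follows. The text gives only the overall ranges (10 to 30 percent and 10 to 750 NOK/hour), so the interval boundaries used here (equal-width intervals for the time change, geometrically spaced intervals for the bid) are assumptions, as are the function names and the uniform draws within intervals.

```python
import random


def linear_edges(lo, hi, n):
    """n equal-width intervals between lo and hi (assumed spacing)."""
    return [lo + (hi - lo) * i / n for i in range(n + 1)]


def geometric_edges(lo, hi, n):
    """n geometrically spaced intervals between lo and hi (assumed spacing)."""
    return [lo * (hi / lo) ** (i / n) for i in range(n + 1)]


def draw_design(ref_time_min, rng):
    """Draw eight (time change, bid) combinations for one respondent."""
    t_edges = linear_edges(0.10, 0.30, 8)      # share of reference travel time
    b_edges = geometric_edges(10.0, 750.0, 8)  # NOK per hour

    # One draw from each of the eight intervals.
    shares = [rng.uniform(t_edges[i], t_edges[i + 1]) for i in range(8)]
    bids = [rng.uniform(b_edges[i], b_edges[i + 1]) for i in range(8)]
    rng.shuffle(bids)  # combine time changes and bids randomly

    # Two tasks of each type, allocated randomly.
    types = ["I", "I", "II", "II", "III", "III", "IV", "IV"]
    rng.shuffle(types)

    tasks = []
    for share, bid, task_type in zip(shares, bids, types):
        d_time = share * ref_time_min  # difference in travel time, minutes
        d_cost = bid * d_time / 60.0   # difference in travel cost, NOK
        tasks.append({"type": task_type, "d_time": d_time,
                      "bid": bid, "d_cost": d_cost})
    return tasks


# Hypothetical reference trip of 60 minutes, fixed seed for reproducibility.
for task in draw_design(60.0, random.Random(1)):
    print(task)
```

The cost difference follows from the bid: a respondent who accepts paying `d_cost` to save `d_time` minutes reveals a VTT of at least `bid` NOK/hour in that task.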
Like in the Dutch study described in Section "Evidence from the Dutch VTT study", all respondents are also asked whether they are active members of a commercial internet panel. This means that we can identify panel members also among those that are not recruited from the panel.

Recruitment method and data collection
Respondents were recruited from three sources: an internet panel, an alternative email register and field intercept. Originally, the email register was chosen as the main source of recruitment because it was relatively cheap. At the same time, it was decided that a part of the sample should be from an internet panel for comparability with the previous Norwegian VTT study (Ramjerdi et al. 2010). Field intercept was chosen as a supplement specifically to test for the effect of recruitment method.
The data was collected in October and November 2018. For practical reasons, the survey was not launched at the same time using all sources of recruitment, but the samples are overlapping in terms of interview date and time of day (see Section "Heterogeneous effects by recruitment method"). The samples contacted in the internet panel and email register were national without any particular stratification.
The internet panel was provided by Norstat (norstatpanelet.no), one of the leading panel providers in Norway. 12 Invitations were sent to a representative subsample among the members of this panel until the quota of 3000 interviews was reached. The response rate was 14 percent, which is relatively low but comparable to other surveys based on the same panel (e.g. Navrud et al. 2017; Dugstad et al. 2020).
The alternative email register was provided by a subsidiary of the Norwegian postal service. It contains individuals who have submitted their contact information to the postal service (e.g. when changing address or pausing mail delivery) and agreed to receive invitations to surveys. 200,000 emails were sent, which means that despite a very low response rate (2.7 percent), this recruitment method still accounts for the largest part of our sample. Response rates from such general email sample frames are known to be relatively low (Biemer et al. 2017).
Field recruitment was targeted at travelers by local public transport and passenger boat. The latter group was targeted because passenger boat is a somewhat marginal mode, which means recruiting from a national sample (e.g. an internet panel) will only result in a few boat passengers being included. Those recruited in the field were asked if they want to answer on site (on a tablet or their smartphone) or receive the questionnaire via email and answer later.
Field recruitment and interviews were undertaken in the cities of Oslo, Trondheim and Molde, which have populations of about 670,000, 190,000 and 27,000, respectively. Travelers were approached approximately at random both on board and at stops and stations. At the beginning of the data collection period, respondents were only approached at stops and stations. Later, respondents were also recruited on board, except on trains, where permission for on-board recruitment was not granted by the train operator. Fifty percent of those who were approached by an interviewer agreed to be recruited for the survey. Out of these, 48 percent completed the survey. This yields a total response rate of 24 percent in the field recruitment sample.
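The total response rate in the field sample is the product of the two stage-wise rates reported above. A minimal sketch of the arithmetic (the function name is hypothetical):

```python
def total_response_rate(recruitment_rate: float, completion_rate: float) -> float:
    """Share of approached travelers who end up completing the survey."""
    return recruitment_rate * completion_rate


# 50 percent agreed to be recruited; 48 percent of these completed the survey.
rate = total_response_rate(0.50, 0.48)
print(f"{rate:.0%}")  # prints "24%"
```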
Those recruited in the field and from the email register were informed that by participating, they were eligible for a lottery where they could win a gift voucher. 13 Those recruited from the internet panel could not join the lottery, as this would interfere with the reward system of the panel provider. In practice, the expected payoff is of a similar order of magnitude across sources of recruitment.

Reference trip and interview location
One important feature of field interviews in transport research is that one can ask respondents about a trip that they are currently making or just finished. When contacting respondents via email, mail or phone, it is common to ask about a recent trip. A novel feature of our data is that also those who were contacted via email were asked if they were currently traveling. This implies that we have variation in interview location within all three sources of recruitment. This is illustrated in Fig. 2. Those who are recruited from the internet panel or email register are asked if they are currently on a trip when answering the survey. If they answer yes, this trip will be their reference trip and the interview context will also be "Traveling". If they answer no, they will be asked to report their recent trips, and a reference trip will be picked from these. In this case, the interview context will be "at home, office etc." Those recruited in the field are asked if they want to answer on site or receive the questionnaire via email. If they choose to answer on-site, the trip that they are currently making or just completed will be their reference trip and the interview context will be "Traveling". If they choose to receive the questionnaire via email, they will be asked the same question as those recruited from the internet panel or email register.
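The assignment of reference trip and interview context described above can be summarized as a small decision function. This is a sketch of the verbal description only; the function name, argument names and labels for the recruitment sources are hypothetical.

```python
def assign_reference_trip(source, answers_on_site=False, currently_traveling=False):
    """Return (reference trip, interview context) for one respondent.

    source: "panel", "email" or "field" (labels assumed for illustration).
    answers_on_site: only relevant for respondents recruited in the field.
    currently_traveling: answer to the question asked of everyone who
        responds via an emailed questionnaire.
    """
    if source not in {"panel", "email", "field"}:
        raise ValueError(f"unknown recruitment source: {source}")

    # Field respondents who answer on site use the trip they are making
    # or have just completed.
    if source == "field" and answers_on_site:
        return ("current trip", "Traveling")

    # Everyone else is asked whether they are currently on a trip.
    if currently_traveling:
        return ("current trip", "Traveling")

    # Otherwise a reference trip is picked from the reported recent trips.
    return ("recent trip", "at home, office etc.")


print(assign_reference_trip("email", currently_traveling=True))
print(assign_reference_trip("field", answers_on_site=False, currently_traveling=False))
```

Because the "Traveling" context can arise within each recruitment source, interview location and recruitment method are not perfectly bundled in the data.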
Among those in the "Traveling" group, some will have finished their trip and some will still be underway. A potential concern could be that those in the latter group are not able to predict their travel time. (No explicit instructions are given regarding this in the questionnaire.) Although we expect that most respondents would be able to give a reasonable estimate of this, we investigate in Section "Heterogeneous effects by recruitment method" how travel time differs by interview context. We also control for the difference in travel time between alternatives in the analysis in Section "Empirical modeling".
The interviews of those who are recruited in the field and answer while traveling differ slightly from those who chose to answer in another location in the sense that an interviewing assistant is present in the recruitment, and in some cases also during the interview. However, respondents receive little guidance on how to answer the choice questions and the rest of the questionnaire. In this sense, all interviews can be regarded as unassisted. A more important effect of the presence of an interviewer is probably that they influence who chooses to participate in the survey, which we will explore in more detail below.

Recruitment method and interview location
The total sample used in this paper contains 7160 respondents. 14 Among these, 37 percent are from the internet panel, 55 percent are from the email register and 8 percent are recruited in the field. Figure 3 shows the answers to the question about whether the respondent is an active member of an internet panel. As expected, the share of panel members is by far highest among those recruited from the internet panel, although a few of these apparently do not consider themselves members. More importantly, there is a non-negligible share of panel members also in the two other samples, particularly the sample recruited from the email register. 15 There is also a notable share reporting that they do not know whether they are panel members, particularly in the field sample. This means that we can study not only differences in VTT between the three recruitment samples used in this survey, but also the effect of internet panel membership more generally.
Figure 4 shows how many are actually traveling while making their choices, which means that their current trip will be the point of departure for the stated choice experiment. Most of the respondents are not currently traveling, but are instead answering questions based on a recent trip. The share who are currently traveling is largest among those recruited in the field, which is expected since some of these answer on site. However, there is also a non-negligible share of currently traveling respondents in the two other samples. This means that we can estimate an effect of interview location, controlling for recruitment method.
Among those recruited from the internet panel or the email register, travelers by public transport are most likely to be on a trip when answering the survey, followed by air travelers, car passengers and car drivers (presumably not while driving). 16 This is as expected, since those traveling by public transport are more likely to check their email. In the empirical analysis in Section "Empirical modeling", this will be taken into account by controlling for travel mode as well as other characteristics of the reference trip.

Respondent characteristics
Before we estimate the effects of survey recruitment method, internet panel membership and interview location on the VTT, we explore whether respondents from the different survey samples differ on average in terms of observable characteristics. This gives evidence on the relationship between survey methods and representativeness, and it will guide us regarding which factors need to be controlled for in the regression models in Section "Empirical modeling". Table 4 shows observable characteristics of respondents by source of recruitment. This includes time of interview, which was recorded in the survey software. For the sake of brevity, we only report average values. As expected, respondents recruited in the field are on average much more likely to be traveling in or from the two major cities Oslo and Trondheim, where field recruitment was carried out. This sample also has a substantially higher share of female respondents 17 and shorter reference travel time. There is not much difference in average travel time between the panel and email register samples.
15 A likely explanation for this difference is that receiving an e-mail invitation to a survey is similar to receiving an invitation to a panel or a survey invitation from a panel, and therefore partly attracts the same group of respondents.
16 We over-sampled long-distance trips (70 km or more) by asking respondents whether they had undertaken such a trip during the last two weeks. If the answer was yes, the majority of respondents would be assigned the last such trip as their reference trip. A large share of trips by airplane will be in this category. As a result of this, most of those answering questions about air travel will not currently be traveling.
We also see that panel respondents are on average older and have lower income and education, also compared to those recruited from the email register. They are also much more likely to answer the survey during the weekend or in the morning. 18 Apart from that, the interviews in the three samples are on average conducted during almost the same week, but with some differences as expected. 19
Table 4 also compares the characteristics of our sample to the general population. We see that our sample is somewhat younger on average, which might be as expected as younger people travel more. However, income is also lower and education level substantially higher in our sample. The share of trips that take place or start in Oslo is also higher than the corresponding population share, but this might be because many people from around Oslo travel in and out of the capital.
Fig. 3 Share of respondents who report that they are members of an internet panel, by recruitment method
Fig. 4 Share of respondents who are currently traveling when answering the questionnaire, by recruitment method and transport mode used on the reference trip. Note: Car and air travelers who were recruited in the field are excluded due to very small samples
17 A partial explanation for this could be that field recruitment was targeted at public transport, which is more frequently used by women. However, the difference still seems large.
In Table 5, we compare respondents based on whether they report to be members of an internet panel. Here, we do not include those who were recruited from the internet panel. We see that those who are active members of an internet panel are older, have lower education and are less likely to be traveling in or from Oslo, similar to the findings in Table 4. 20 They also have somewhat longer travel time. However, they do not have substantially lower income on average. Passive panel members and non-members seem to be more similar. Active panel members also answer the survey slightly earlier (0.25 weeks difference). Apart from this, there are no apparent differences in when the interviews were carried out (day or time). Finally, Table 6 shows how these characteristics differ between respondents that answered while traveling and those who did not ('not traveling'), within the field recruitment sample and the remaining sample (internet panel and email register). In the field recruitment sample, those who are not currently traveling have considerably shorter travel time, but this is not the case in the remaining sample. A possible explanation is that those who are on a short trip and are intercepted by an interviewer are less likely to have time to answer on-site than those who are on a longer trip. We return to this mechanism in Section "Heterogeneous effects by recruitment method".
Apart from this, there are no apparent differences in the field recruitment sample between those answering while traveling and not. In the remaining sample, those who answered while traveling are significantly younger and are also somewhat more likely to be traveling in or from Oslo. They are also somewhat less likely to answer during a weekend and in the evening, which might be because they answer the survey right away. 21 (Email invitations were sent out during daytime on working days.) There is no apparent difference in income, and also not in reference travel time.
The above findings imply that when we estimate the effect of survey recruitment method, internet panel membership and interview location on the VTT, we need to evaluate to what extent our results are robust to controlling for observed characteristics of the respondents. In this respect, factors like age, income, education, geography and when the survey was answered seem to be relevant control variables.

Choice behavior
Before analyzing the effects of internet panel membership and interview location on the VTT, we have a closer look at the choice behavior of respondents in the stated choice experiment. As pointed out in Section "Previous literature", survey method could impact both the average VTT and the variance or degree of consistency in choice behavior. We therefore compare the following indicators across samples:
(a) How much time it takes the respondent to complete the stated choice experiment (excluding those that spend more than 15 min 22 ), as recorded by the survey software.
(b) Whether the respondent always chooses the alternative displayed on the left-hand side or always the alternative displayed on the right-hand side.
(c) Whether the respondent makes at least one 'inconsistent' choice in the sense that he or she accepts a bid which is higher than the lowest bid he or she rejects.
(d) The difference (in NOK/hour) between the highest accepted bid and the lowest rejected bid. If the difference is negative, it is defined as zero.
(e) Whether the respondent always chooses the cheapest alternative (rejects all eight bids).
(f) Whether the respondent always chooses the fastest alternative (accepts all eight bids).
While differences in (a)-(d) can be thought of as mainly reflecting differences in the ability or willingness to engage in the choice situation, differences in (e) and (f) can reflect both such differences and differences in average VTT. Other things equal, a higher share that always rejects (e) implies a lower VTT, and a higher share that always accepts (f) implies a higher VTT.
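Indicators (b)-(f) can be computed mechanically from a respondent's sequence of choices. The sketch below is illustrative only: the record layout (`bid`, `accepted`, `side`) is a hypothetical one chosen here, and setting the gap in (d) to zero when a respondent accepts or rejects every bid is an assumption, since the text does not define the gap in that case.

```python
def choice_indicators(choices):
    """Compute indicators (b)-(f) for one respondent.

    choices: non-empty list of dicts with hypothetical keys
      'bid'      : offered price per hour saved (NOK/hour)
      'accepted' : True if the faster, more expensive option was chosen
      'side'     : 'left' or 'right', the screen side of the chosen option
    """
    if not choices:
        raise ValueError("at least one choice is required")

    accepted = [c["bid"] for c in choices if c["accepted"]]
    rejected = [c["bid"] for c in choices if not c["accepted"]]

    # (d) highest accepted bid minus lowest rejected bid, floored at zero.
    # Assumption: defined as zero when all bids are accepted or all rejected.
    if accepted and rejected:
        gap = max(0.0, max(accepted) - min(rejected))
    else:
        gap = 0.0

    return {
        "same_side": len({c["side"] for c in choices}) == 1,  # (b)
        "inconsistent": gap > 0,                              # (c)
        "gap_nok_per_hour": gap,                              # (d)
        "always_cheapest": not accepted,                      # (e)
        "always_fastest": not rejected,                       # (f)
    }
```

A respondent who accepts a bid of 150 NOK/hour after rejecting one of 100 NOK/hour would be flagged as 'inconsistent' with a gap of 50 NOK/hour.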
When considering indicators (c) and (d), one should have in mind that choices might also depend on the size and sign of differences in travel time and costs. Hence, a higher number of 'inconsistent' choices does not necessarily indicate low answer quality. Here, we are mainly interested in differences between samples, not measuring overall answer quality.
Again, we only show average values in each subsample. Table 7 shows that average completion time (a) is lower in the sample recruited from the internet panel. We also see that the share that rejects all eight bids (e) is higher in this sample. 23 The differences in the other indicators are less notable. The share that always picks the alternative on one side is slightly higher in the panel sample, but the difference is not statistically significant.
In line with the above findings, those who are not recruited from the internet panel but report to be active panel members are also more likely to reject all eight bids (Table 8). 24 Both active and passive panel members also have a slightly shorter completion time than non-members. Apart from this, panel members do not seem to differ so much from the rest of the sample with respect to other indicators of choice behavior.
Finally, those who answer the survey while traveling are less likely to reject all eight bids and more likely to accept all eight bids, indicating a higher VTT (Table 9). 25 The pattern with respect to other indicators of choice behavior is less clear. 26 Interpreting these results in light of the literature discussed in Section "Survey and interview methods in VTT studies", we find no evidence that internet panels are associated with lower data quality. We do find, however, that those recruited from the internet panel have a shorter completion time. This might suggest that these respondents are more trained in answering on-line surveys, and therefore are able to complete the stated choice experiment faster without sacrificing precision in their answers.

Empirical modeling
In this section, we show the effects of recruitment method and interview location when controlling for other characteristics of the traveler and reference trip as well as design variables.

Estimation model
As explained in Section "Recruitment method and data collection", the choice situation always involves the choice between two travel options of which one is faster and more expensive. Following Fosgerau et al. (2007), we model the probability of choosing the faster and more expensive option ('accepting'), $y_{nt} = 1$, as a function of the offered price per time unit saved ('bid'), $B_{nt}$, and a set of covariates including other characteristics of the choice task, characteristics of the reference trip and characteristics of the respondent.
Hence, the decision rule is:
$$y_{nt} = 1 \quad \text{if} \quad \ln VTT_{nt} + \frac{1}{\mu}\,\varepsilon_{nt} > \ln B_{nt}, \tag{1}$$
where $B_{nt}$ is the offered trade-off ("bid"), defined as the ratio of the absolute differences in the cost and the time attribute between the two alternatives:
$$B_{nt} = \frac{|\Delta c_{nt}|}{|\Delta t_{nt}|}. \tag{2}$$
Here $\Delta c_{nt}$ and $\Delta t_{nt}$ denote the differences in travel cost and travel time between the two alternatives, and $\mu$ is a scale parameter. The error terms $\varepsilon_{nt}$ are assumed to be independently and identically distributed (iid) logistic random variables with mean zero and variance $\pi^2/3$. We further parameterize the log of VTT with explanatory variables:
$$\ln(VTT_{nt}) = \beta_0 + \beta_1 \mathit{Panel}_{nt} + \beta_2 \mathit{Email}_{nt} + \beta_3 \mathit{Active}_{nt} + \beta_4 \mathit{Passive}_{nt} + \beta_5 \mathit{Unknown}_{nt} + \beta_6 \mathit{Nottravel}_{nt} + \gamma' x_{nt} + u_n, \tag{3}$$
where $\mathit{Email}_{nt}$ is a dummy variable for being recruited from the email register and $\mathit{Panel}_{nt}$ is a dummy variable for being recruited from the internet panel. $\mathit{Active}_{nt}$, $\mathit{Passive}_{nt}$ and $\mathit{Unknown}_{nt}$ are dummy variables for whether the respondent reports to be an active internet panel member, a passive internet panel member or not knowing whether he or she is an internet panel member, respectively. Finally, $\mathit{Nottravel}_{nt}$ is a dummy variable for not being currently traveling when answering the questionnaire. $x_{nt}$ is a vector of control variables. The logarithmic specification implies that the estimated parameter of each covariate approximates the percentage change in VTT resulting from a one unit increase in the value of the covariate. $u_n$ represents unobserved taste heterogeneity across respondents, yielding a lognormal distribution assumption for the VTT. $u_n$ is normally distributed over respondents with mean value 0 and standard deviation $\sigma$.
Our main interest is in the estimated parameters $\beta_1$-$\beta_6$, which capture the effects of recruitment method, internet panel membership and interview context. To reduce concern that the estimated parameters instead capture some other unobserved characteristics that are correlated with the variables of interest, we compare the results of models including different sets of control variables in $x_{nt}$:
(a) Characteristics of the reference trip and the experimental design: Trip purpose, travel mode, reference travel cost, trip distance, sign of the changes in travel time and travel cost relative to the reference values and absolute value of the difference in travel time relative to the reference value. 27
(b) Socioeconomic characteristics and time of interview: Gender, age, age squared, income, income not reported (dummy), higher education ≥ 3 years (dummy), higher education ≥ 5 years (dummy), education level not reported (dummy), week of interview, interview 6-12 AM (dummy), interview 0-5 PM (dummy), interview 5-11 PM (dummy), interview in the weekend (dummy).
(c) Geographic variables: Dummies for Oslo, Trondheim, and missing information.
Note that by controlling for the sign of changes in travel time and cost, we take into account reference-dependency in the VTT (De Borger and Fosgerau 2008). However, our experimental design (see Section "Recruitment method and data collection") also implies that time (cost) increases and decreases are equally represented in our sample.
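To make the decision rule concrete, the sketch below computes the implied probability of accepting a bid. It assumes that the logistic error enters the rule divided by a scale parameter (called `mu` here), in which case the acceptance probability is the logistic function of `mu * (ln VTT - ln B)`. The function names are hypothetical, and the log-likelihood shown is conditional on a given value of ln VTT; integrating over the respondent-specific term u_n, as a full estimation would require, is not shown.

```python
import math


def accept_probability(ln_vtt, bid, mu=1.0):
    """P(y = 1): probability of choosing the faster, more expensive option.

    ln_vtt : log value of travel time for this respondent and choice task
    bid    : offered price per unit of time saved (same units as exp(ln_vtt))
    mu     : scale parameter of the logistic error (assumed notation)
    """
    z = mu * (ln_vtt - math.log(bid))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(choices, ln_vtt, mu=1.0):
    """Sum of log choice probabilities over (bid, accepted) pairs,
    conditional on a fixed ln_vtt."""
    total = 0.0
    for bid, accepted in choices:
        p = accept_probability(ln_vtt, bid, mu)
        total += math.log(p if accepted else 1.0 - p)
    return total
```

When the bid equals the respondent's VTT, the probability of accepting is exactly one half; bids below the VTT are accepted with higher probability and bids above it with lower probability.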

Table 10 Estimated effects of recruitment method, internet panel membership and interview location on value of travel time (VTT)
Reference case: Respondents who are recruited in the field, are not members of an internet panel and are traveling while answering the survey. Standard errors in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01
The model results can also be used to simulate the VTT distribution of the estimation sample or an alternative sample. In this paper, we do not focus on the average VTT or the distribution, only on the effect of recruitment method, controlling for other factors. Table 10 shows the effects of recruitment method, internet panel membership and interview location based on different model specifications. All models are estimated on 54,060 stated choices from 6,973 respondents. 28 The full model results are presented in the Appendix.
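As a rough illustration of what simulating the VTT distribution involves under a lognormal specification, the sketch below draws the respondent-specific term from a normal distribution and exponentiates. The numerical values (a median of 100 NOK/hour and a standard deviation of 0.5 on the log scale) are purely hypothetical and are not estimates from this study; the function names are likewise assumptions.

```python
import math
import random


def simulate_vtt(ln_vtt_mean, sigma, n_draws=20000, seed=0):
    """Draw VTT values as exp(ln_vtt_mean + u), with u ~ N(0, sigma)."""
    rng = random.Random(seed)
    return [math.exp(ln_vtt_mean + rng.gauss(0.0, sigma)) for _ in range(n_draws)]


def lognormal_mean(ln_vtt_mean, sigma):
    """Analytical mean of a lognormal variable: exp(m + sigma^2 / 2)."""
    return math.exp(ln_vtt_mean + 0.5 * sigma ** 2)


# Hypothetical inputs, for illustration only.
draws = simulate_vtt(math.log(100.0), 0.5)
simulated_mean = sum(draws) / len(draws)
analytical_mean = lognormal_mean(math.log(100.0), 0.5)
```

Because the lognormal distribution is right-skewed, the mean exceeds the median, so a covariate that shifts ln VTT by a constant scales both the mean and the median by the same factor.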

Effects of recruitment method, panel membership and interview location
Column (1) shows that those recruited from the internet panel have a substantially lower VTT than those recruited in the field, while those recruited from the email register are somewhere in between. However, when controlling for panel membership among all respondents as well as interview location (column 2), the effects of recruitment method in itself become much smaller and are no longer statistically significant.
In line with the Dutch evidence, we find that active members of an internet panel have lower VTT (columns 2-5). The difference is about -11 percent relative to being a non-member ($1 - e^{-0.12} = 0.113$) and is robust to controlling for sociodemographic characteristics, geographic location and unobserved heterogeneity.
Perhaps most interestingly, we find a strong and negative effect of answering the questionnaire while 'not traveling', i.e. not while (or directly after) making the reference trip which is the point of departure for the stated choice experiment. The estimated effect is about -17 percent ($1 - e^{-0.184} = 0.168$) and is highly robust to changes in the model specification. One possible explanation is that respondents perceive the travel experience as more positive in retrospect than when they are currently making the trip (Mitchell et al. 1997).
The effect of interview location is contrary to the finding of Hanssen (2012) for ferry passengers in Norway. Note however that the effect of interview location reported by Hanssen also partly captures an effect of recruitment method, as most of those who were not interviewed on site (i.e. received the questionnaire via mail) also had not been recruited in the field. This means that there could be other unobserved differences between the samples.
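The percentage effects quoted in this section follow from the log-linear specification: a dummy coefficient β changes the VTT by a factor of exp(β), i.e. by exp(β) − 1 in relative terms. The short sketch below reproduces the two figures from the coefficients −0.12 and −0.184 given in the text; the function name is a hypothetical one.

```python
import math


def pct_effect(beta):
    """Relative change in VTT implied by a dummy coefficient in a log-VTT model."""
    return math.exp(beta) - 1.0


active_member = pct_effect(-0.12)    # active panel member vs. non-member
not_traveling = pct_effect(-0.184)   # not traveling vs. traveling
```

For small coefficients the exact effect is close to the coefficient itself, which is why the parameters can be read approximately as percentage changes.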

Heterogeneous effects by recruitment method
In the previous section, the effect of interview location was assumed to be the same across all sources of recruitment. One difference between these samples is that in the field recruitment sample, answering 'while traveling' implies that an interviewer is present in the recruitment and, in some cases, also during the interview. This could potentially have some effect on the survey process (e.g. West and Blom 2017), though, as mentioned in Section "Previous literature", social desirability bias as a result of this is not likely in a VTT setting. In the panel and email register samples, an interviewer is never present. Also, those who choose to answer on site could be less busy than those who do not. Therefore, we estimate separate models to investigate whether the effect of answering while 'not traveling', as well as the effect of panel membership, differs between the samples. Table 11 shows that the 'not traveling' effect is negative and statistically significant both in the field recruitment sample and the sample based on the two other recruitment methods. Although the estimated effect is somewhat higher in absolute terms in the field recruitment sample, the difference is not statistically significant. This strengthens the hypothesis that this effect is somehow related to the fact that the respondent is currently not making a trip, and therefore relates less closely to the choice situation described in the questionnaire. In other words, the choice situation may appear more salient for the respondent who is currently traveling. However, other explanations can still not be ruled out.
The effect of being an active panel member is also negative in both samples, but not statistically significant in the field recruitment sample. Hence, we cannot conclude regarding potential differences in this effect. The lack of precision in the field sample might be due to the lower sample size combined with the fact that the share of active panel members is very low in this sample. However, the point estimate is still negative.

Discussion
Our results provide new evidence on the effect of survey recruitment and interview methods on both data quality and estimated values of travel time (VTT). Regarding data quality in itself, we do not find any evidence that this is lower in the sample recruited from an internet panel or among those who report to be internet panel members. This supports previous studies that do not generally find evidence of significantly lower data quality in SP studies based on internet panels (Zhang et al. 2020; Sandorf et al. 2020, 2022).
Regarding the level of VTT, our results clearly indicate that both recruitment method and interview location can impact the estimated VTT based on SP data. Those who report to be active members of an internet panel have a significantly lower VTT than those who do not. A likely explanation is that those who are panel members and regularly answer surveys have a lower opportunity cost of time measured in monetary terms. This could reflect both differences in marginal utility of time and the marginal utility of income.
Regarding interview location, we find that those who answer the survey while traveling have a higher VTT, both among those recruited in the field and in the remaining sample. It is less clear whether this effect should also be taken into account if the objective is to obtain representative values. First, since interview location is the result of choices made by the respondent, it could be that this effect captures some differences in unobserved characteristics between those who are answering while traveling and those who are not. Second, it is not necessarily the case that the hypothetical choices made by those experiencing the travel situation while answering are more in line with actual behavior. Since travel decisions are made in advance, it could be that travelers experience the travel time as costlier when they are making the trip than they did when choosing their travel option (time inconsistency).
Our survey design has the feature that those who answer the questionnaire while traveling will always be answering based on their current trip, while those who are not traveling will be answering based on a previous (in most cases recent) trip. Those who answer based on their current trip might exhibit a greater degree of reference dependency in their choices, which has an ambiguous effect on estimated VTT (see Section "Hypothetical choice situations and bias in SP studies"). However, the choice of reference trip could also have other effects. A possible extension in future studies would be to ask some of those who answer while traveling about a previous trip instead of their current trip. This would enable the researcher to identify the effect in itself of answering while traveling, independent of choice context. In our case, these two treatments are bundled.

Conclusion
Our results have potentially strong implications for studies that aim to obtain representative values of travel time for use in cost-benefit analysis. If it is the case that (especially experienced) internet panelists have a lower opportunity cost of time, results that are primarily based on internet panel surveys would need to be adjusted for this panel effect. Based on the findings documented in this paper, the recommended values in the new Norwegian VTT study (Flügel et al. 2020) are based on simulations in which those who reported that they were active members of an internet panel were given a lower weight. 29 If this adjustment had not been made, the new Norwegian VTT values would have been low compared to the values based on the 2009 study. This is the case even if we do not assume that the VTT increases over time at the same rate as the income level, but assume a more moderate growth as advocated by ITF (2019). Interestingly, internet panel was the main source of recruitment in the 2009 study. Since the two studies were methodologically very similar, it might seem like there is a negative effect of internet panel membership in 2018 relative to 2009. One possible reason for this is that the representativeness of internet panels has changed over time, at least with respect to this particular purpose of obtaining VTT. However, alternative explanations cannot be ruled out.
If it is the case that internet panel members have a lower opportunity cost of time than the general population, it might also be the case that those who respond to the survey (regardless of recruitment method) have a lower opportunity cost of time than those who choose not to respond. This kind of non-response bias (Groves and Peytcheva 2008) would imply that all survey-based VTT estimates are likely to be conservative. The magnitude of this effect is of course not possible to assess by use of (any) survey data, but a potentially testable implication is that the bias increases as the response rate declines.
Regarding the effect of interview location, several explanations are possible. We recommend investigating this further based on an experimental approach, where respondents are allocated randomly into answering while traveling or not while traveling. The effect of other treatments that make the choice situation more salient could also be explored. In any case, the strong effect found in this paper means that the issue should be taken seriously and calls for more research on the topic.
On the other hand, it is difficult to see how the effect of internet panel membership (or voluntary survey participation more generally) could be verified in an experimental framework. Here, our hypothesis is that the voluntary nature of panel membership is exactly what drives our result. Instead, we recommend further investigating the relationship between survey response rate and VTT, more specifically whether a lower response rate implies a lower VTT due to self-selection. Meta-analysis would be a suitable approach for such an investigation, which could include both SP and RP surveys. However, one should be aware that in many studies, recruitment method (and possibly response rate) and interview location cannot be disentangled.
It is too early to conclude on what the implications for studies that aim to estimate a representative VTT will be. If the panel effect is driven by self-selection, all survey-based VTT estimates (both SP and RP) could potentially be biased downwards, but methods that achieve higher response rates might mitigate this problem. If the higher VTT of those who are currently traveling reflects that SP choices made by respondents who are currently traveling are closer to RP choices, this calls for concern regarding the validity of SP surveys more generally.
For researchers and practitioners, designing surveys that are less vulnerable to the issues discussed here might involve some costs. Field recruitment requires more manpower than email recruitment, but this could be worth the cost if it results in a less self-selected sample. At the same time, our results illustrate that field recruitment is not strictly necessary for being able to interview people while they are traveling. In the smartphone era, travelers are only an e-mail or text message away.