Introduction

Surveys have for decades been useful research tools in environmental criminology, both in urban (e.g. Nourani et al. 2020; Tseloni et al. 2018; Van Dijk and Steinmetz 1983a, b) and rural environments (e.g. Harkness et al. 2022). Yet, knowledge remains limited about the qualities and challenges they may present, especially when traditional paper surveys are compared with web-based surveys. Since research methodology in environmental criminology is an evolving field, there is a need to report the results of such comparisons of survey instruments: for instance, whether respondents provide similar answers across different survey formats, or which format is less vulnerable to bias. This information can contribute to the ongoing development of best research practices when using surveys, thereby improving the quality of research findings, with clear implications for safety interventions and practice.

The rapid advancement of technology in research, in particular the advent of web-based surveys, has significantly changed how people interact with surveys (Evans and Mathur 2005). Traditional surveys may be paper-based or administered face-to-face, while web surveys are conducted online. These differences can affect the validity of responses due to varying modes of interaction (Neuman 2012). There have also been expectations that web surveys would be less demanding and reduce respondents' perceived burden of answering, although this has not been confirmed (Haas et al. 2021). To check the consistency of web-based and traditional surveys, researchers may compare data collected through web-based surveys with data collected through traditional methods (e.g. paper surveys). By doing this, researchers can assess the validity of both methods for capturing real-world phenomena or measuring constructs of interest. It is equally important to assess the internal consistency of the surveys, which tells us how well the items within a survey are related to each other or how consistently respondents reply to questions that are framed slightly differently.

This article aims to compare the pattern of responses obtained by two near-identical surveys (a web and a paper survey) used to investigate the transit safety of travellers at railway stations in Sweden. To achieve this aim, we first assess the response rates within the surveys and check whether, and how, the response rate changes as respondents progress through the survey. Then, after checking for differences in the samples, we examine how respondents answered different categories of questions to explore whether, and how, the survey mode affects their answers. It is particularly relevant to assess whether the act of writing the answer on paper makes respondents less willing to finish the survey. Finally, we also investigate whether the order of the answer alternatives affects respondents' answers when comparing the two surveys. Here, we compare the paper survey, in which the order of alternatives is fixed, with the web-based survey, in which the order is automatically randomised for each respondent. To carry out the study, a comparable sample of responses was selected from each group of respondents. Two matching sub-samples (of 500 responses each) were created to account for variations in each sample's characteristics that could have an impact on the outcomes.

Surveys in criminological research

The utilisation of surveys is a common research method for collecting data and gathering information from individuals or groups of people (Rossi et al. 1983). They can serve various purposes, but are often designed to gather specific data on a particular topic or research question when the available knowledge of the issue is limited (Forza 2002). Surveys allow researchers to collect information directly from respondents, providing insights into their opinions, attitudes, behaviours, or experiences. Surveys are widely used for decision-making, data collection, and research in various fields, for example, the social sciences, market research, and health care (Safdar et al. 2016), to name just a few. In criminology, victimisation surveys have been used for decades to learn about people's experiences as victims of crime. These surveys offer a source of victimisation data in addition to police-recorded crime statistics (Hough and Maxfield 2007). They also provide significant additional data on crime, especially crime that is not reported to the police (such as domestic violence), as well as on the frequency with which crimes are reported to the police, the fear of crime, the use of crime prevention measures, and indicators of trust in the police. The first victimisation surveys were conducted in the 1960s precisely to address the issue of crimes that are not reported to the police (Heiskanen and Laaksonen 2021). In the following years, victimisation surveys were carried out in almost every western country (Block 1984; Fattah 1981; Skogan 1976). The majority of this research used the United States' National Crime Survey as a model, where the primary objective was the gathering of information on the amount of crime that took place in different areas according to crime type (Van Dijk and Steinmetz 1983a, b).

A survey can be carried out in different ways, such as by post, face-to-face with an interviewer asking the questions, via e-mail, by telephone, on the web, or in person by handing out paper questionnaires. There are also several strategies for reaching the target audience of a web-based survey. These may include reaching out to a random or representative population, extending mail invitations or advertisements, or using crowdsourcing platforms where respondents typically receive compensation for completing the survey. The survey method and distribution procedure can thus affect the response rate, respondent answers, turnaround time, costs, sample characteristics, and data quality (Bachmann et al. 1996; Dillman 2011; Fang et al. 2021; Kelfve et al. 2020; Kwak and Radler 2002; Sproull 1986; Yun and Trumbo 2000). In the list below, we explore a range of survey methods, highlighting their respective strengths and weaknesses (Alderman and Salem 2010; Jones et al. 2013; Vaske 2011).

  • Face-to-face surveys In this method, interviewers directly ask respondents questions. This enables the use of more complex question styles and follow-up questions, while also offering the chance to establish a connection with the participant, which could result in increased response rates. However, it may require significant amounts of time and money, and is at risk of being influenced by interviewer bias. Survey participants might also experience discomfort when talking about sensitive subjects in person.

  • Paper-based surveys Participants receive questionnaires, which they fill out and return. This method offers flexibility to respondents and versatility, as it can be used in various settings without any technical resources. Nevertheless, data entry mistakes may occur when answers are entered or converted into digital form. It can also be expensive and time-consuming, with limited reach.

  • Telephone surveys These surveys involve interviewing people via telephone. They offer cost-effective and wide-reaching data collection and can be quick to perform as there is no need for travel. Yet, they could face challenges such as low response rates, limited sample representativeness, and potential interviewer effects or non-response bias.

  • Postal questionnaires Participants receive questionnaires by mail and return them via post. This approach is also affordable and facilitates widespread distribution, while also giving the participant the flexibility to finish the survey at their own pace and convenience. Low response rates, possible delivery problems, and long turnaround times are obstacles because individuals require time to receive, fill out, and send back the survey materials via mail.

  • Web-based surveys Surveys performed on the web are often cost-effective, as they do not require expenses for printing and postage. Additionally, they are fast and effective, can be implemented easily, and streamline data collection procedures. The disadvantages are that they may miss groups that are not proficient with online platforms (e.g. older adults) or that for various reasons cannot access the survey (e.g. in remote areas with poor internet access).

  • With active sampling Researchers approach a random or representative population online to participate in surveys. Active sampling means that specific individuals or groups are selected, ensuring better sample representation (e.g. YouGov). These surveys are also efficient, cost-effective, and have a broad reach.

  • Through mail invitations or advertisement The survey is distributed through broad mail invitations or advertising to encourage people to fill it out online (on, for example, SurveyMonkey or Crowdsignal). This approach ensures high accessibility but could lead to self-selection bias and limited control over sample demographics. This is the method assessed in the present study.

  • Through crowdsourcing platforms Crowdsourcing platforms are utilised to recruit respondents who are compensated for completing surveys (e.g. MTurk or Clickworker). This approach allows for the inclusion of diverse participants and is also cost-efficient. The downsides are that data quality and sample representativeness may vary, and that there are concerns regarding the motivation and engagement of participants.

In this study, we compare the results from a paper-based survey with those from a web-based survey distributed through mail invitations and advertisements.

Research on the survey mode and the effect on responses

The literature is split between those who show evidence that web-based surveys could replace traditional paper ones with minor effects on response rates and lower costs (e.g. Hohwü et al. 2013; Kaplowitz et al. 2004) and those who argue that the survey mode can have a notable effect on the responses, producing sometimes higher and sometimes lower response rates.

The response rate is the percentage of valid responses received for each individual survey question; it may fluctuate as respondents progress through the survey when they choose to skip certain questions and answer subsequent ones. The completion rate, by contrast, is the percentage of participants who answered all questions in the survey. Response rates, in particular, are important because they directly affect the estimated prevalence rates on which public policies are based. In criminology, for example, Laaksonen and Heiskanen (2014) explored the differences between three survey modes (web, telephone, and face-to-face) for collecting information on victimisation and crime-related issues. They show that, in comparison with the telephone and face-to-face modes, the web survey consistently produced higher estimated prevalence rates for fear and property crimes. This suggests that when respondents are given access to an online, self-administered platform, they may be more likely to disclose information about victimisation than they would when using other modes. The authors also found that estimates of the occurrence of violence tend to be lower in telephone interviews. This pattern was associated with a feeling of greater privacy in modes where respondents answer questions independently. It aligns with other studies which found that most participants generally prefer computerised questionnaires over paper surveys or face-to-face interviews, as they tend to feel more comfortable answering questions about socially sensitive behaviours on a computer (Davis Jr et al. 1992; Davis Jr and Morse 1991; Turner et al. 1998; van den Berg and Cillessen 2013).

Studies in other fields show mixed evidence. McCabe et al. (2006) found few significant differences between survey modes: their results suggest that web and mail surveys provide comparable estimates of alcohol use in a non-randomised mixed-mode design. Patrick et al. (2022) found that while response rates and substance use estimates were not significantly affected, the mode of response was influenced by sociodemographic factors such as race, smoking habits, marital status, and education level. Reported substance use prevalence did not significantly differ according to survey mode after adjusting for sociodemographic characteristics.

Yun and Trumbo (2000) detected a number of potentially important differences in response characteristics depending on whether a survey was carried out by post, e-mail, or on a website, but found that these differences did not greatly influence their analyses. They concluded that the differences detected between the response groups indicate that using multi-mode survey techniques improved the representativeness of the sample without biasing other results. Other studies found that paper surveys have lower item non-response rates (i.e. respondents skipping or leaving certain questions unanswered), but that respondents tend to give longer open-ended responses in web surveys due to the ease of typing (Kwak and Radler 2002). Two recent studies add new evidence. Roberts et al. (2022) found that, compared to mobile browser respondents, app respondents were less likely to drop out of the study, which indicates that there might be differences even within the modes of answering digital surveys. Haas et al. (2021) found no difference in respondents' perceived response burden between web surveys and paper surveys.

Some of these studies have reported greater response rates in favour of web surveys (Kelfve et al. 2020; McCabe 2004; McCabe et al. 2002; McMaster et al. 2017; Wygant and Lindorf 1999). Web surveys may enable respondents to participate more actively because they can be completed at their convenience and from any location with an internet connection, whereas paper surveys must be physically completed and returned, which can take more time and be less convenient. Other studies have identified greater response rates for paper surveys than for web surveys (Bason 2000; Kennedy et al. 2000; Kwak and Radler 2002; Messer and Dillman 2011; Shih and Xitao 2008). For example, Fang et al. (2021) compared a web-based survey with a traditional survey questionnaire (face-to-face questioning) in paediatrics and showed that the web-based survey had a significantly lower response rate: the web-based survey had an effective rate of 70%, while the completeness rate of the traditional questionnaire survey was 86%. However, they also found that the output of the web-based survey was unaffected by the various data sources, which indicates strong internal consistency. Moreover, the response rate of different survey modes may also differ depending on the target group. Kelfve et al. (2020) found that while the web survey method resulted in a higher overall response rate, certain demographics, including older individuals, particularly women, those who were not married, those with lower levels of education, and those not employed, were less inclined to respond to web-based questionnaires.

Shih and Xitao (2008) performed a meta-analysis of 39 studies that compared response rates from web and mail (paper) surveys and found that 22 of them favoured the paper-based survey, 12 the web-based survey, and 5 found no significant difference in response rate between the survey modes. We updated the review by adding further studies (Bason 2000; Fang et al. 2021; Hohwü et al. 2013; Kaplowitz et al. 2004; Kelfve et al. 2020; Kennedy et al. 2000; McCabe 2004; McCabe et al. 2006; McMaster et al. 2017); see Appendix 1 (Table 8) for details. Out of the 47 studies that compared response rates between paper and web surveys, a majority (57%) found that the paper-based survey performed better than the web survey (32%). Only 11% found no significant difference between the two survey modes (Fig. 1).

Fig. 1 Survey mode with higher response rate according to previous studies, n = 47. Source: Based on Shih and Xitao (2008), updated by the authors

Impacts of order of alternatives on survey questions

The order of alternatives in multiple-choice questions is a well-known factor affecting how respondents answer (Ferber 1952; Marks et al. 2016). This phenomenon may cause order bias and highlights the importance of how choices are presented in surveys. Previous research has found that the sequence of alternatives can significantly influence participant responses. As early as the 1950s, Ferber (1952) investigated potential bias in sample surveys caused by the order of questions or alternatives. The study examined how the order of occupations in a questionnaire affected respondents' credit ratings using two questionnaire forms (Form A and Form B), finding that respondents who received Form A were stricter in their ratings, assigning fewer "good" and more "poor" ratings compared to those who received Form B.

Another study investigated the effects of the order in which names are listed on peer nomination rosters in sociometric research and found that peer nomination counts were significantly influenced by the order in which names appeared on the rosters (Marks et al. 2016). Names listed earlier received more nominations for specific sociometric criteria. According to the study, name order significantly affected affective and relational variables, such as friendship and acceptance, and accounted for more than 5% of their variance. Participants were more influenced by name order when there was less agreement among peers regarding the criteria, as shown by the stronger effects of name order for variables with lower internal reliability (Marks et al. 2016). Similarly, another study on peer nominations in middle-school settings found that long lists can introduce bias, with higher-ranked names receiving more nominations (Poulin and Dishion 2008). However, there have also been studies that, contrary to previous research findings, did not find any significant name-order effects (whereby alternatives listed earlier receive more selections) (Liu et al. 2024). The authors also noted that the lack of significant name-order effects in their study does not definitively resolve the issue, particularly for longer rosters.

Research design

Research questions

Drawing on the current strand of research in this area and the responses from our two surveys, we address the following research questions:

  1. How does the response rate change as the survey progresses? And how does the completion rate differ?

  2. Does the survey mode affect the respondents' answers after we have checked for differences in the sample? If so, how?

  3. Does the order of the alternatives affect respondents' answers when comparing the two surveys (a paper survey with a fixed order of alternatives and a web survey with a randomised order)? If so, how?

Data collection

In this study, we have used two survey methods to investigate the transit safety of travellers at railway stations in Sweden. We conducted a web-based survey from May to November 2022, while a traditional paper questionnaire survey was conducted in May–June 2022. Based on the method and source, the participants were divided into two groups: (a) respondents to the web-based survey and (b) respondents to the paper questionnaire survey.

The web-based survey was created using a web platform (Crowdsignal in WordPress). The survey was directed at train travellers living in the municipalities of the study area where the 47 stations were located. It was distributed using email lists, social media, local Facebook groups, and the webpages of the municipalities concerned.

In order to promote the survey, posters and cards with a QR code were set up at a number of stations; when scanned with a mobile phone, the code directed the person to the web survey (Fig. 2). The researchers also participated in radio programmes promoting the research project and encouraging people to answer the survey. The survey was open from May 2022 to November 2022, following approval by the Swedish Ethical Review Authority.

Fig. 2 Poster used for inviting passengers to answer the survey via the internet

The traditional survey was conducted by the investigators face-to-face, using paper questionnaires, at the stations and onboard the trains. Informed consent was obtained orally from the respondents before the investigation. The survey was conducted during May 2022, with a supplementary session in August 2022.

The demographics of the survey sample can vary depending on the type of survey, as demonstrated in earlier studies. Two matching sub-samples were created in order to account for variations in each sample’s characteristics that could have an impact on the outcomes. These had a similar structure, and each had 500 responses. They were based on a set of background variables such as gender, sexual orientation, age, country of birth, income, disabilities, and station size. All these variables have demonstrated an impact on factors such as safety perception and victimisation to varying degrees in our analysis. For instance, women consistently reported higher levels of fear and experienced more victimisation, while men generally felt safer. Similar patterns were observed among younger individuals, those with disabilities, as well as those travelling from smaller stations. In Table 1, the sub-samples of each survey type are placed side by side with reference to these variables. There were no significant differences between the samples in these background variables according to chi-square tests.

Table 1 Description of the sub-samples from each survey type, N = 1000
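To illustrate the balance check described above, the sketch below runs a chi-square test of each background variable against survey mode. It assumes the pooled sub-samples are stored in a pandas DataFrame with a 'mode' column ('paper'/'web') and one column per background variable; all column and function names are hypothetical and only indicate the general procedure, not the actual dataset.

    # Minimal sketch of the balance check between the matched sub-samples (hypothetical names).
    import pandas as pd
    from scipy.stats import chi2_contingency

    def check_balance(df, background_vars):
        """Chi-square test of each background variable against survey mode ('paper' vs 'web')."""
        rows = []
        for var in background_vars:
            table = pd.crosstab(df[var], df["mode"])      # contingency table: variable x mode
            chi2, p, dof, _ = chi2_contingency(table)
            rows.append({"variable": var, "chi2": round(chi2, 2), "dof": dof, "p": round(p, 3)})
        return pd.DataFrame(rows)

    # Example call (hypothetical column names); p-values above 0.05 for every variable
    # would indicate no significant differences between the two sub-samples.
    # balance = check_balance(subsamples, ["gender", "age_group", "country_of_birth",
    #                                      "income", "disability", "station_size"])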

Methods of data comparison

Evaluating change in response rate during the progression of the surveys and their completion rates

For the response rate, the total number of valid responses was recorded for each question in both survey modes. These counts were then compared to the initial count, providing a basis for calculating the percentage change and revealing shifts in participant engagement. Positive changes indicated an increased response rate, while negative changes indicated a decrease, suggesting challenges in comprehension or interest. The percentage change in the response rate could then be plotted on a graph for visual comparison. The completion rate was calculated by dividing the number of completed surveys by the number of survey respondents.
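As a minimal illustration of these calculations, the sketch below assumes the answers are stored in a pandas DataFrame with one row per respondent, one column per question, and missing answers coded as NaN; the function names are illustrative only.

    # Sketch: per-question response rate, its change from the initial count, and the completion rate.
    import pandas as pd

    def response_rate_profile(answers):
        """Valid responses per question, response rate (%), and % change from the first question."""
        n = len(answers)
        valid = answers.notna().sum()                            # valid responses per question
        rate = 100 * valid / n                                   # response rate in %
        change = 100 * (valid - valid.iloc[0]) / valid.iloc[0]   # % change vs. first question
        return pd.DataFrame({"valid": valid,
                             "response_rate_pct": rate.round(1),
                             "change_from_start_pct": change.round(1)})

    def completion_rate(answers):
        """Share of respondents who answered every question (%)."""
        return 100 * answers.notna().all(axis=1).sum() / len(answers)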

Assessing the impact of the survey mode on respondents’ answers

In order to assess whether the survey mode has any influence on the answers of the respondents, different categories of questions were created. We combined the answers within these categories (coded as 1 for a positive response and 0 for a negative response) to create summary scores, or indices, which could then be compared between the two survey types to assess the differences. The following indices were created for comparison in further analysis: victim of crime, witness to crime, fear of crime, variables affecting safety, safety precautions, and recommendations. These are composite variables that summarise responses to a group of related questions or items. Each index represents a different category related to crime and safety, such as "Victim of crime", "Witness to crime", "Fear of crime", etc. For example, "Victim of crime" includes victimisation experiences of the various crime types listed in the survey (theft, robbery, violence, unlawful threat or hate crime, sexual harassment, and stalking). If the respondent ticked several of these boxes in the survey, the "Victim of crime" index increased accordingly. The same procedure was repeated for all categories of questions (indices).
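A sketch of the index construction is given below, under the assumption that each crime type is stored as a 0/1 column in the response data; the column names are placeholders rather than the survey's actual item names.

    # Sketch: building the "Victim of crime" index by summing binary (0/1) victimisation items.
    import pandas as pd

    VICTIM_ITEMS = ["theft", "robbery", "violence", "unlawful_threat_or_hate_crime",
                    "sexual_harassment", "stalking"]             # hypothetical column names

    def build_index(df, items):
        """Sum the 0/1 items so the index grows with every box a respondent ticked."""
        return df[items].fillna(0).sum(axis=1)

    # df["victim_of_crime"] = build_index(df, VICTIM_ITEMS)
    # The same call is repeated for the other categories (witness, fear, precautions, ...).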

In order to measure how well the questions within the survey were related to each other, Cronbach's alpha coefficients were calculated before creating the indices, to ensure internal consistency. Cronbach's alpha is used when evaluating a scale or group of related items meant to measure a specific construct or trait. It measures the degree to which the scale's items are correlated with one another, providing an estimate of how well the scale reflects the underlying construct, and it ranges from 0 to 1. The indices were then compared using t-tests to explore differences in the responses between the two survey modes; t-tests compare means between two groups to determine whether there is a significant difference between them. We also used chi-square analysis on individual variables related to crime and safety, for example, to examine whether there are significant variations in victimisation, fear, or safety precautions taken by respondents depending on the survey mode.
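The sketch below illustrates these steps under simple assumptions: Cronbach's alpha is computed from the standard item-variance formula over the index columns, an independent-samples t-test compares one index between the two modes (Welch's variant is used here for robustness), and a chi-square test compares a single item against survey mode. Column and function names are illustrative, not those of the actual dataset.

    # Sketch: Cronbach's alpha over the indices and mode comparisons (hypothetical names).
    import pandas as pd
    from scipy.stats import ttest_ind, chi2_contingency

    def cronbach_alpha(items):
        """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
        k = items.shape[1]
        item_var = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_var / total_var)

    def compare_modes(df, index_col):
        """t-test of one index (e.g. 'fear_of_crime') between web and paper respondents."""
        web = df.loc[df["mode"] == "web", index_col]
        paper = df.loc[df["mode"] == "paper", index_col]
        return ttest_ind(web, paper, equal_var=False)            # Welch's t-test

    def compare_variable(df, var):
        """Chi-square test of a single crime/safety item (0/1) against survey mode."""
        return chi2_contingency(pd.crosstab(df[var], df["mode"]))

    # alpha = cronbach_alpha(df[["victim", "witness", "fear", "factors",
    #                            "precautions", "recommendations"]])
    # t, p = compare_modes(df, "fear")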

The effects of randomised alternative order on survey responses

To investigate the impact of the order of alternatives in multiple-choice questions (specifically, the paper survey with fixed alternatives and the web-based survey with randomised alternatives), we examined how the ranking of answers varied across three questions: (1) factors affecting safety, "Can you mark which of the following factors affect your safety at the station you normally travel from?" (a set of 16 alternatives); (2) safety precautions, "Can you mark which of the following statements about safety/insecurity apply to you when you travel by train during the day or evening?" (a set of 14 alternatives); and (3) recommendations, "Can you mark which of the following could make your train journey safer?" (a set of 16 alternatives).

In order to make the comparison between the two survey modes, we ranked the alternatives in order of magnitude, from the largest to the smallest, for each respective survey. Each alternative was then assigned a ranking number according to its perceived significance by the participants (Table 2). Then the ranking lists of the two survey modes were compared using Spearman’s rank correlation coefficient and Kendall’s tau-b (τb) correlation coefficient.

Table 2 Ranking of safety precautions from both survey modes
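A minimal sketch of this rank comparison is given below, assuming the number of respondents selecting each alternative is available per survey mode; the DataFrame layout and names are hypothetical. Note that scipy's kendalltau computes the tau-b variant by default.

    # Sketch: comparing the ranking of alternatives between the paper and web surveys.
    import pandas as pd
    from scipy.stats import spearmanr, kendalltau

    def compare_rankings(counts):
        """`counts` has one row per alternative and columns 'paper' and 'web' holding
        the number of respondents who selected each alternative in that mode."""
        paper_rank = counts["paper"].rank(ascending=False)       # 1 = most selected
        web_rank = counts["web"].rank(ascending=False)
        rho, p_rho = spearmanr(paper_rank, web_rank)
        tau, p_tau = kendalltau(paper_rank, web_rank)            # tau-b by default
        return {"spearman_rho": rho, "p_rho": p_rho,
                "kendall_tau_b": tau, "p_tau": p_tau}

    # Example: compare_rankings(safety_precaution_counts) for the 14 precaution alternatives.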

Results

Change in response rate during the progression of the surveys and their completion rates

Response rates and completion rates are important indicators of participant engagement and survey effectiveness. In Fig. 3, the graph shows how the response rates changed as respondents progressed through both types of surveys, as well as their respective completion rates.

Fig. 3 Response rate for each question and the completion rate: paper versus web survey

Starting with the response rate, a significant decline in the number of respondents was observed just after the first part of the web-based survey, with one-third dropping out after the first five questions (travel frequency, travel times, from which station they travel). In contrast, the paper survey saw a slight 8% decline during the same phase. About half of respondents completed the last question of the web-based survey, showing varying levels of participant engagement along the way (the average response rate for all questions was 60%). The response rates in the paper survey exhibit fluctuations, yet they generally remain consistently high, surpassing 90% for most questions (the average response rate for all questions was 93%). However, there is a notable exception in the question concerning the “age of respondents”, where there is a significant drop of nearly 35% (see the thick blue line).

Regarding the completion rates for each of the individual surveys (where every question was answered), we see that the results are very similar for both survey modes. The paper survey reached an overall completion rate of 39%, while the web survey achieved a slightly higher completion rate of 42%. If we exclude the outlier variable mentioned above (age of respondents), the completion rate of the paper survey increases substantially to 54%; see the thin blue line in Fig. 3. Table 3 shows the drop by survey mode from the beginning to the end of the survey, with and without "age of respondents".

Table 3 Completed responses and percentage change from the initial count in each survey

Impact of survey mode on respondents’ answers

Tables 4 and 5 show the descriptive and reliability statistics of the indices in both surveys. The Cronbach's alpha coefficient is 0.76, which indicates high internal consistency among the items in the scale (here represented by the six indices), following DeVellis and Thorpe (2021). This means that people's responses tend to be consistent with each other regardless of survey mode; in other words, the indices measure the same construct in a similar and reliable way. The results from the comparison of the two surveys are presented below.

Table 4 Descriptive statistics of the indices in both surveys
Table 5 Reliability statistics of the indices

We can observe that there are significant differences in the responses between the two survey modes for all the categories of examined questions (Table 6). Respondents in the web survey reported higher levels across all categories of indices compared to the survey group using a paper questionnaire. They had been victimised more, witnessed more crimes, had greater fear of crime, had more factors affecting their safety, took more safety precautions, and requested more improvements in the form of recommendations.

Table 6 Comparison of mean responses and t-test results

The analysis revealed a significant disparity between the two groups regarding self-reported victimisation experiences. Participants in the web group reported a notably higher level of victimisation, with an average score of 0.57 compared to the paper group’s average of 0.22. The t-test yielded a statistically significant result (t = 6.7, p < 0.001), indicating that the web group experienced higher victimisation rates. Similar to the victimisation findings, participants in the web group reported significantly higher levels of witnessing incidents compared to the paper group. The web group’s mean score was 1.40, while the paper group reported an average of 0.59. The t-test again demonstrated a highly significant difference (t = 8.8, p < 0.001).

The analysis of fear of crime produced one of the most pronounced disparities between the groups. The web group expressed significantly higher levels of fear, with an average score of 4.44, in contrast to the paper group’s average of 1.84. This difference was highly statistically significant (t = 18.9, p < 0.001), indicating that respondents from the web group were more apprehensive about crime. Participants in the web group also reported higher scores on factors affecting safety, with a mean of 4.68 compared to the paper group’s mean of 3.64. The t-test revealed a statistically significant difference (t = 5.3, p < 0.001), indicating that the web group perceived more factors affecting their safety.

A substantial difference emerged in responses related to precautions taken. The web group reported significantly higher precautionary measures, with a mean score of 8.16, whereas the paper group had an average of 3.87. Once again, this difference was highly significant (t = 13.9, p < 0.001), illustrating that respondents in the web group were more inclined to take precautions. Lastly, when it came to making recommendations related to safety and crime prevention, the web group scored higher on average (with a mean of 5.72) compared to the paper group (with a mean of 4.64). The t-test showed a statistically significant difference (t = 4.9, p < 0.001), indicating that the web group had more suggestions or ideas for improving safety.

To explore the differences in respondents' answers more thoroughly, we also conducted chi-square tests on 100 variables related to crime and safety. Remarkably, 81 out of these 100 variables showed significant differences. Across all these cases, the percentages were consistently higher for the web-based survey group. This further implies that individuals in the web-based survey group reported higher levels of victimisation and felt more unsafe compared to their counterparts who participated in the paper-based survey. These findings highlight notable distinctions in experiences and perceptions related to crime and safety between the two survey modes. The full list of the variables compared with chi-square tests can be found in Appendix 2.

The effects of randomised alternative order on survey responses

Examining the correlation between the paper and web survey rankings of the factors affecting safety revealed significant alignment. Spearman's rank correlation coefficient demonstrated a strong positive correlation (n = 16, r = 0.88, p < 0.001), while Kendall's tau-b correlation coefficient indicated a very strong correlation (n = 16, τb = 0.75, p < 0.001).

Analysing how safety precautions were ranked by the paper and web survey groups revealed robust correlations. Both Spearman's rank correlation coefficient (n = 14, r = 0.97, p < 0.001) and Kendall's tau-b correlation coefficient (n = 14, τb = 0.91, p < 0.001) demonstrated a very strong positive correlation. Finally, evaluating the ranking of recommendations from the paper and web survey groups also showed significant correlations. Spearman's rank correlation coefficient (n = 16, r = 0.92, p < 0.001) and Kendall's tau-b correlation coefficient (n = 16, τb = 0.77, p < 0.001) both indicated a very strong positive correlation. The results from all correlation analyses can be found in Table 7. The full ranking lists of the paper and web survey groups can also be found in Appendix 3.

Table 7 Correlations between paper and web survey group rankings for three categories

Despite this difference in the presentation of alternatives, answers from the two survey formats were highly correlated. This suggests that survey respondents in both survey groups were able to provide consistent and comparable answers regardless of whether the answer options were presented in a fixed or random order.

Discussion of the results

The results indicate the surveys' high internal consistency, which means that the indices measure the same construct in a similar and reliable way in both surveys. However, the study also reveals significant disparities in responses between web-based and paper-based survey formats across various dimensions related to transit safety at railway stations. In the web survey, respondents reported significantly higher levels of victimisation and fear than in the paper survey. These findings are consistent with prior research showing that participants tend to feel more comfortable answering questions about socially sensitive behaviours, victimisation, or perceptions of fear in a computer-based survey than in paper surveys (Davis Jr et al. 1992; Davis Jr and Morse 1991; Turner et al. 1998; van den Berg and Cillessen 2013). More interestingly, the greatest difference in reporting was found in the question about whether or not respondents take safety precautions, that is, what respondents do to avoid risks and/or make themselves feel safer (indicated by alternatives covering place/time avoidance and changes in behaviour, such as travelling in a group). Respondents were also more outspoken about their safety perceptions (fear of crime while at the station or in different transit environments) in the web survey than in the paper version of the survey. The high number of significant differences in crime and safety related questions (81 out of 100 variables) between the survey types further emphasises the impact of the mode on respondents' answers after checking for sample characteristics. Social desirability could bias the answers in the paper-based survey, especially when anonymity is lower (Krumpal 2013). The web-based survey may also be subject to self-selection bias, which occurs when individuals choose whether to participate, leading to a sample that may not represent the broader population (Anderson et al. 2022; Bethlehem 2010; Khazaal et al. 2014). In this case, it may be that people who have had particularly negative experiences, or who are especially fearful of train travel, were more likely to participate in the survey. This could skew the results, making fear and victimisation seem more prevalent than they actually are among train travellers as a whole.

The web-based survey experienced a notable decline in respondents, particularly in the initial section of the survey, with approximately one-third dropping out after the first five questions, while the face-to-face paper survey showed a more gradual decline during the same part of the survey and then followed a more stable pattern of responses. The paper survey generally maintained high response rates to all questions throughout the survey (93% average response rate), whereas the response rate of the web survey gradually decreased as respondents progressed (60% average response rate). People completing the paper questionnaire might feel the need to be polite and quickly finish the survey, possibly skipping some parts. Online survey takers, by contrast, might tend to give longer, more detailed answers, but may also feel a sense of fatigue or frustration, leading them to suddenly stop participating. This finding goes against the initial idea that the act of writing the answer on paper makes respondents less willing to finish the survey.

The comparable completion rates between paper and web surveys, with only a slight difference (39% for the paper survey and 42% for the web survey), indicate the effectiveness of both survey modes in ensuring participants answered all questions. Notably, the exclusion of the outlier variable (age of respondents) significantly increased the paper survey's completion rate to 54%. This finding underscores the importance of identifying and addressing specific variables that may impact the engagement of participants. It is not certain why the variable 'age of respondents' was avoided by the respondents of the paper survey. One possible explanation is that the way the question was asked might have had an impact on the response rate. Respondents were invited to write down their age in the paper survey (How old are you?) instead of choosing from possible alternatives among age brackets (Which age group are you?), as was done in the web survey. Thus, the change in the format of the variable (from an open question to a multiple-choice question) seemed to have an impact on the willingness of respondents to answer this question (people are less willing to write down and reveal their exact age than to mark an age group in a multiple-choice question). If the variable 'age of respondents' is excluded from the calculation, the completion rate of the paper survey turns out to be higher than that of the web survey. Such findings are in line with previous research, such as Bason (2000) and Shih and Xitao (2008), to name just a few.

Examining the impact of the randomised order of alternatives in multiple-choice questions also provided interesting findings. Despite the different presentation of alternatives (fixed vs. automatically randomised order), the answers from both survey formats were highly correlated; see Liu et al. (2024) for a similar example. This suggests that respondents were consistent in their choices, regardless of the order in which the alternatives were presented in the web or face-to-face paper survey. This finding challenges the conventional concern about order bias in survey responses and highlights the adaptability of participants to different question formats. If the ranking orders had differed, it could have been challenging to compare the results across these survey modes and to decide, for instance, which interventions should be taken to deal with the problems of crime and safety in transit environments. However, the fact that the rankings were highly correlated across survey modes for all three questions means that safety experts and transit operators can single out these suggestions and make them a top priority for interventions.

Like any study of this kind, this study is not free of limitations. Although we were able to select comparable sub-samples from the two groups of respondents, we did not include "frequency of use of the railway system" as a selection criterion, which may have implications. It is possible that drops in the response rate occurred, in the web survey in particular, because some respondents do not frequently use the trains and stations. Another limitation is that we did not consider the fact that in one group the respondents were approached during the trip, while in the other they answered the survey at home or in some other environment. Thus, it is also possible that respondents who answered the survey on the platform were influenced by the presence of a person who "expects" the respondent to answer the survey. Moreover, although the order of the response alternatives did not affect the pattern of responses, it is possible that, for example, posing the questions in another order could have influenced response patterns, which could be tested in future research.

Conclusions

This study set out to assess the pattern of responses obtained by a web and a paper survey used to investigate the transit safety of travellers at railway stations in Sweden. Using statistical tests of different types, we showed that, despite high internal consistency, there were significant disparities in responses between the web-based and paper-based survey formats across various dimensions related to transit safety at railway stations. The web-based survey revealed that respondents were more open about their victimisation, fears, and precautions than those who answered the paper survey. There were also differences in the response and completion rates between the two survey modes; the paper survey did slightly better in both, but was not free of problems. A change in the format of one variable (from an open question to a multiple-choice question) seemed to significantly affect responses in the paper survey, whereas the order of the alternatives (fixed versus randomised) did not, which is positive for the reliability and generalisability of the survey results. While the primary implications of this study's results are for researchers, safety practitioners can benefit from evidence regarding improved survey design and greater confidence in these findings. One recommendation is that mixed-mode survey administration, combining different approaches, can compensate for the weaknesses of each method and potentially provide a more solid ground for safety interventions.