Questions used in this study
Test question: The question used to investigate respondents’ recall ability and memory effects was adopted from Beierlein et al. (2012). It dealt with internal political efficacy and asked how well respondents understand and assess political issues. The test question was presented on its own web survey page and had a vertically aligned response scale with the following five categories: agree strongly, agree, it depends, disagree, and disagree strongly.
Follow-up questions: To determine respondents’ recall ability and memory effects, I adopted several follow-up questions from van Meurs and Saris (1990). Each question captures one of the three facets that constitute my definition of recall ability: stated recall, correct reproduction, and recall certainty. The first follow-up question, which captures stated recall, asked respondents whether they recalled their previous answer using “yes” and “no” response categories. Depending on the answer to the first follow-up question, the second follow-up question, meant to capture correct reproduction, asked respondents either to indicate (if they said yes) or to estimate (if they said no) their previous answer. In line with the scope of this study, some respondents were asked to report their previous answer by selecting the respective category from the actual response scale, while others were asked to report it by entering the respective category in a text field (without the response scale being displayed again). The third follow-up question, which captures recall certainty, asked respondents how confident they were about their reproduction using an eleven-point, end-verbalized response scale running from “not at all certain” to “absolutely certain”.
The test question was placed at the beginning of the web survey and the follow-up questions were placed towards the end of the web survey to ensure a sufficient time interval between them. In total, there were 70 to 74 questions between the test question and the follow-up questions, covering a variety of topics, such as achievement and job motivation and personality traits. Figure 1 illustrates the course of the web survey and Appendix A (see Table 7) shows the wording of the test and follow-up questions.
All questions were designed in German, which was the mother tongue of 96% of the respondents. To improve comparability between PCs and smartphones, I used an optimized survey layout that avoids horizontal scrolling.
I used a two-step split-ballot design with four experimental groups defined by device type (PC or smartphone) and response format (response scale or text field) for reporting the previous answer to the test question. This resulted in a 2-by-2 factorial design, as displayed in Table 2.
In the first step, respondents were randomly assigned to complete the web survey with a PC or a smartphone. This was done by the survey company before the start of the web survey. In the second step, within each device type, respondents were randomly assigned in the web survey to one of the two response formats for reporting their previous answer to the test question: selecting the respective category from the actual response scale or entering the respective category in a text field (without the response scale being displayed again). Appendix B (see Figs. 2, 3) contains screenshots of the two experimental versions of the follow-up question on correct reproduction for PCs and smartphones.
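To make the assignment logic concrete, the following minimal sketch (in Python, with purely illustrative names) mimics the two-step procedure: device type is fixed before the survey starts, and the response format is randomized within device type. The actual randomization was implemented by the survey company and the survey software, not by this code.

```python
import random

def assign_conditions(panelist_ids, seed=1):
    """Illustrative two-step assignment to the 2-by-2 design."""
    rng = random.Random(seed)
    assignments = {}
    for pid in panelist_ids:
        # Step 1: device type, assigned before the web survey starts
        device = rng.choice(["PC", "smartphone"])
        # Step 2: response format, randomized within device type during the survey
        response_format = rng.choice(["response scale", "text field"])
        assignments[pid] = (device, response_format)
    return assignments

# Each panelist ends up in one of the four experimental groups
print(assign_conditions(range(6)))
```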
Procedure of the study
Data collection was conducted by the survey company Respondi (www.respondi.de) and took place in Germany in July and August 2019. Respondi drew a quota sample from its opt-in access panel based on age and gender, resulting in a 3 × 2 quota plan designed to represent the German population with respect to these two demographic characteristics. The quotas were calculated based on the German Microcensus, which served as an official population benchmark. In total, Respondi invited 24,246 panelists to take part in the survey. Of these, 4,581 panelists were screened out because the quotas had already been met or because they tried to access the survey with a different device type than the one they had been assigned. In total, 3,407 panelists started the web survey. Among these, 108 dropped out before being asked the study-relevant questions, leaving 3,299 panelists available for the statistical analyses.
The email invitation to the web survey included information on the estimated time of the web survey (approx. 20 min), the respective device type to use for survey completion, and a link to the survey. The first page of the survey outlined the general topic and the procedure of the survey and included a statement of confidentiality. Respondents received modest financial compensation from Respondi, which was proportional to the length of the survey.
Only respondents who owned both a PC and a smartphone were invited to take part in this study. To identify these respondents, I used the profiling information provided by the survey company. If respondents tried to enter the survey with a different device type than the one they had been assigned, they were blocked from the survey and asked to switch to the correct device type. I also collected user agent strings, which provide information about device properties, such as device type and operating system.
In addition to user agent strings, I collected response times as well as window and browser tab switching. This was achieved by using the open-source tool “Embedded Client Side Paradata” (ECSP) developed by Schlosser and Höhne (2018). Prior informed consent for the collection of paradata was obtained by Respondi as part of the respondents’ registration process.
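As an illustration of how user agent strings inform about device properties, the sketch below classifies a user agent string into one of the two device types using simple substring checks. This is only a coarse illustration and does not reproduce the actual output or logic of the ECSP tool.

```python
def classify_device(user_agent: str) -> str:
    """Coarse device-type classification from a user agent string (illustration only)."""
    ua = user_agent.lower()
    if "mobile" in ua or "android" in ua or "iphone" in ua:
        return "smartphone"
    return "PC"

print(classify_device("Mozilla/5.0 (iPhone; CPU iPhone OS 12_0 like Mac OS X)"))  # smartphone
print(classify_device("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))               # PC
```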
In total, 3,299 respondents participated in the study (1,638 with a PC and 1,661 with a smartphone). This corresponds to a participation rate of 14% among all invitees. The respondents were aged 18 to 70 years, with a mean age of 47 years, and 51% of them were male. In terms of education, 13% had completed lower secondary school (low education level), 35% intermediate secondary school (middle education level), and 53% college preparatory secondary school or university-level education (high education level).
To investigate RQ1a, I report the proportions of respondents’ stated recall regarding their previous answer to the test question (stated recall: 1 = yes). These proportions are compared between PC and smartphone respondents.
To investigate RQ1b on the correct reproduction, the answers of the respondents who were randomly assigned to enter the previously selected response category in a text field had to be coded before statistical analyses. Some respondents provided a correct, but misspelled answer (e.g., “it depent” instead of “it depends”). I decided to count such misspellings as correct reproduction because I am interested in respondents’ recall ability rather than their writing or typing skills. However, some other respondents entered a (slightly) different response category than the one they previously selected (e.g., “agree somewhat” instead of “agree” or “neither/nor” instead of “it depends”). I decided to count such discrepancies as incorrect reproduction.
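The following sketch shows how such a coding rule could be approximated automatically, assuming a simple similarity threshold for tolerating misspellings; the actual coding in this study may well have been carried out by hand, so the function and threshold below are illustrative only.

```python
import difflib

CATEGORIES = ["agree strongly", "agree", "it depends", "disagree", "disagree strongly"]

def code_text_answer(text_entry, cutoff=0.8):
    """Map a free-text entry to a response category, tolerating misspellings.

    Returns the matched scale category, or None if the entry cannot be mapped
    to any category on the scale (e.g., "agree somewhat"), in which case it is
    coded as incorrect reproduction.
    """
    entry = text_entry.strip().lower()
    matches = difflib.get_close_matches(entry, CATEGORIES, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(code_text_answer("it depent"))       # "it depends" -> misspelling counted as that category
print(code_text_answer("agree somewhat"))  # None -> not a category on the scale
# Correct reproduction would then require the coded category to equal the
# answer the respondent originally gave to the test question.
```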
With respect to RQ1b, I investigate the proportions of respondents’ correct reproduction. This measure is determined by comparing respondents’ answer to the test question with their answer to the second follow-up question (correct reproduction: 1 = yes). In line with the scope of the study and the experimental design, I compare the proportions across device types (PC and smartphone) and response formats (response scale and text field).
Regarding RQ1c, I investigate the means of respondents’ recall certainty, which was measured on an eleven-point, end-verbalized response scale (recall certainty: 1 “not at all certain” to 11 “absolutely certain”). These means are compared across the four experimental groups defined by device type and response format.
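The descriptive comparisons for RQ1a to RQ1c can be summarized in a few lines of code. The sketch below assumes a data frame with one row per respondent and hypothetical column names (device, format, stated_recall, correct_reproduction, recall_certainty); it illustrates the comparisons described above rather than reproducing the original analysis scripts.

```python
import pandas as pd

# Hypothetical analysis data set; values and column names are illustrative.
df = pd.DataFrame({
    "device": ["PC", "PC", "smartphone", "smartphone"],
    "format": ["response scale", "text field", "response scale", "text field"],
    "stated_recall": [1, 0, 1, 1],           # 1 = yes
    "correct_reproduction": [1, 0, 1, 0],    # 1 = yes
    "recall_certainty": [9, 4, 10, 6],       # 1 = not at all certain ... 11 = absolutely certain
})

# RQ1a: proportions of stated recall by device type
print(df.groupby("device")["stated_recall"].mean())

# RQ1b: proportions of correct reproduction by device type and response format
print(df.groupby(["device", "format"])["correct_reproduction"].mean())

# RQ1c: mean recall certainty across the four experimental groups
print(df.groupby(["device", "format"])["recall_certainty"].mean())
```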
To investigate RQ2a on stated recall and RQ2b on correct reproduction, I conduct two logistic regression analyses with stated recall and correct reproduction as binary dependent variables (see coding above), respectively. To investigate RQ2c on recall certainty, in contrast, I conduct an ordinary least squares (OLS) regression analysis with recall certainty as the dependent variable (see coding above). In line with the scope of this study and the experimental design, I use response scale (1 = yes) and PC (1 = yes) as the key independent variables in all three regression analyses. I also control for several variables that previous research suggests have an effect on stated recall, correct reproduction, and recall certainty (Rettig et al. 2020; Revilla and Höhne 2020; Schwarz et al. 2020; van Meurs and Saris 1990): extreme answer (1 = yes), response time (in seconds), in-between time (in seconds; see Footnote 3), on-device media multitasking (1 = yes; see Footnote 4), age (in years), education with high as the reference category: low (1 = yes) and middle (1 = yes), male (1 = yes), and survey participation (continuous; see Footnote 5).
In the logistic regression analysis on correct reproduction, I additionally include stated recall as an independent variable. Correspondingly, in the OLS regression analysis on recall certainty, I include stated recall and correct reproduction as independent variables (see coding above).
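The model specifications can be sketched with the statsmodels formula interface; the column names are hypothetical and the control variables correspond to those listed above. This is a sketch of the model structure, not the original analysis code.

```python
import statsmodels.formula.api as smf

# Control variables as listed above (hypothetical column names)
CONTROLS = ("extreme_answer + response_time + in_between_time + multitasking"
            " + age + educ_low + educ_middle + male + survey_participation")

def fit_models(df):
    """Fit the three regression models for RQ2a-c on a respondent-level data frame."""
    # RQ2a: logistic regression on stated recall
    m_stated = smf.logit(f"stated_recall ~ response_scale + pc + {CONTROLS}", data=df).fit()
    # RQ2b: logistic regression on correct reproduction, with stated recall added
    m_correct = smf.logit(
        f"correct_reproduction ~ response_scale + pc + stated_recall + {CONTROLS}",
        data=df).fit()
    # RQ2c: OLS regression on recall certainty, with stated recall and correct reproduction added
    m_certainty = smf.ols(
        f"recall_certainty ~ response_scale + pc + stated_recall + correct_reproduction + {CONTROLS}",
        data=df).fit()
    return m_stated, m_correct, m_certainty
```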
With respect to RQ3, I use the estimation procedure proposed by van Meurs and Saris (1990) to determine memory effects across device types and response formats. More specifically, I subtract the proportions of respondents stating no recall but correctly reproducing their previous answer from the proportions of respondents stating recall and correctly reproducing their previous answer. The rationale is that respondents who correctly reproduce their answer without recalling it presumably do so because of a stable attitude rather than memory; the difference therefore allows me to distinguish between correct reproduction due to attitude stability and correct reproduction due to memory. In addition, it allows me to compare the size of the memory effects obtained in this study with those obtained in earlier studies (Rettig et al. 2020; Revilla and Höhne 2020; Schwarz et al. 2020; van Meurs and Saris 1990).
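Under one common reading of this procedure, the two proportions are computed among respondents who do and do not state recall, respectively, and the difference is taken within each experimental group. A minimal sketch, assuming a data frame like the illustrative one shown earlier:

```python
def estimate_memory_effects(df):
    """Memory effect per experimental group: proportion of correct reproduction among
    respondents stating recall minus the same proportion among those stating no recall."""
    rates = (df.groupby(["device", "format", "stated_recall"])["correct_reproduction"]
               .mean()
               .unstack("stated_recall"))
    return rates[1] - rates[0]

# Example (df as in the descriptive sketch above):
# print(estimate_memory_effects(df))
```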