Abstract
Paradata are widely used in conjunction with surveys, from predicting behavior for targeted interventions and monitoring data quality and interviewer performance to understanding and correcting biases in the data. We define survey paradata broadly: as nonsubstantive data that relate to the survey and its processes in at least one of three ways—they are produced by survey processes, describe them, or are used to manage and evaluate them. They typically would not exist without the survey. They may be automatically produced (e.g., keystrokes), actively collected (e.g., interviewer observations), or constructed later on (e.g., when a human labeler rates respondent–interviewer rapport by listening to recordings).
First, we review other data types (auxiliary, contextual, and metadata) because their overlaps with paradata can make it difficult to grasp paradata precisely. We discuss paradata definitions, including their weaknesses, arriving at our definition.
Second, we offer an overview of our field’s practice and literature: paradata examples, heterogeneity across paradata types and design options, applications, and challenges.
With paradata a somewhat mature concept in survey methodology, we hope to provide a stimulating, broad introduction to our field’s practice and literature, accessible to anyone irrespective of professional background. We also hope that this chapter provides a valuable backdrop for the conceptualizations of paradata in other disciplines, as presented in this volume.
1 Introduction
Surveys and Survey Methodology
A survey is a systematic, standardized data collection effort that proceeds mainly by asking questions and recording the responses of the so-called survey participants or respondents.Footnote 1 Sometimes, data are not collected about individuals but about, e.g., households or firms—yet, ultimately it is still humans providing answers. Surveys can be carried out by interviewers, be self-administered by the respondents, or take on a hybrid form in which interviewers are present but, after handing over a tablet or paper questionnaire, are inactive unless needed. More specifically, the survey ‘mode’ is about how a survey is conducted: interviews in person or by telephone, web-based surveys and mail questionnaires (both self-administered), or even multiple modes for one survey across regions or persons, or time.Footnote 2
Surveys have important benefits. They can cover a broad range of topics and can also probe for very detailed information. They can be tailored to the specific research questions for which a survey is conducted. Also, a person’s attitudes, past experiences, or other information not recorded anywhere may be best or even only available by asking them. Finally, surveys benefit from decades of methodological research (see below).Footnote 3 Thus, academic research, government agencies, public opinion research and polling, and the private sector will continue to rely on surveys.
This chapter looks at paradata specifically from the perspective of survey methodology. Most survey methodologists originally come from the social sciences, psychology, and statistics. However, the field has its own terminology, goals, challenges, and thinking. Thus, how we regard and use paradata is hopefully best understood against the background context we provide next.
Surveys are conducted to answer substantive research questions,Footnote 4 mainly through statistical analysis of the collected data. Survey methodology studies the design, execution, and monitoring of surveys as well as the statistical analysis of survey data (Groves et al., 2009, ch. 1.4): what are sources of problems, how can they be measured, and what are methods to address them? Besides the quality of data and of data analysis, costs are a consideration.Footnote 5 A ‘survey organization’ (the entity conducting the survey) needs to be profitable. Costs also determine the quality, quantity, and scope of a survey as well as the number of surveys that can be afforded, thereby influencing the number of research questions that can be answered. Survey methodologists care about costs not only because of these real-world constraints but also because costs imply data quality trade-offs.
Survey methodologists often think about the quality of data and of data analysis along two dimensions: the survey participants (representation) and the responses (measurement). Low participation rates, an increasingly grave problem for our field, increase the costs to achieve a fixed number of respondents. Representation, meanwhile, is about differences between respondents and nonrespondents: the less the respondents are representative of the population about which a researcherFootnote 6 wishes to draw inferences, the more likely the data (analyses) are biased. Representation problems can occur at every step. First, there is rarely a complete list of all units in a population. The ‘sampling frame’ is the incomplete list that is available, from external sources or through construction, e.g., a ‘lister’ walking a neighborhood to collect addresses. ‘Coverage’ is about how representative the sampling frame is of the intended target population. Second, only a subset of the units from the sampling frame is actually selected for possible inclusion in the survey (‘sample’). Third, who from the selection ends up in the data is then determined by who is not successfully contacted and then by who decides to not participate (‘unit nonresponse’).
Imagine the eventual survey data as a rectangular, tabular data sheet: each row corresponds to exactly one respondent, and each column corresponds to exactly one survey question. Representation is then about how representative the rows are. Meanwhile, measurement considers each cell and asks whether and how the value in this cell deviates from the true value that one intended to capture.Footnote 7 Such errors can, again, occur at every step from planning to data analysis. First, researchers start from ideas or conceptions (‘constructs’) about the “elements of information” they seek (Groves et al., 2009, ch. 2.2.1, 2.3.1). Some constructs have decent, objective counterparts in the real world, but many are ‘latent’ (‘unobservable’, not directly measurable). ‘Validity’ is about the degree to which the concept in the researcher’s mind matches how respondents understand the corresponding survey question. The precise wording of a question, the question order, and other design factors can affect how participants interpret a question. Second, a response can deviate from the truth because of, e.g., recall error, low motivation, but also interviewer effects on, e.g., sensitive questions. A participant may also be unwilling or unable to respond to a particular survey ‘item’Footnote 8 at all: ‘item nonresponse’. Third, an initial or raw response may be processed or edited: the respondent might change their initial answer later or the interviewer might edit it. If only categorical answers are permitted, the initial response must also be mapped to the given categories. Sometimes, raw information is processed later on by ‘labelers’ (also called coders or annotators): e.g., coding open-text responses into categories or rating respondent behavior based on recordings.
Motivation
We discuss other data types and definitions of paradata in Sect. 2. In short, our own definition (see Sect. 2.2) is that paradata are data that are themselves not the survey’s substantive data, that would typically not exist without the particular survey, that can be automatically produced, actively collected, or constructed later on, and that relate to the survey and its processes in one of three ways: they are produced by the survey processes, often as by-products, they describe the survey processes, or they are used to manage and evaluate the survey process(es).
There are many different examples of survey paradata (Sect. 3), with much heterogeneity between them but also ample scope for designing them as needed (Sect. 4). The allure to survey methodologists comes from the goals and challenges laid out above: paradata can be employed to recognize problems in the survey data, to correct for them in the statistical analysis, and to monitor problems in near real-time or even to predict them, which is the basis for interventions (Sect. 5). The hope is that paradata capture information about the processes that produce the survey data that would otherwise not be available. Some paradata types had been used in our field long before Couper (1998) coined the term ‘paradata’.Footnote 9 As survey methodologists, we touch on direct uses of paradata in substantive research only very briefly. However, while paradata may help with problems in the substantive data, they themselves also face and pose challenges (Sects. 6 and 7). Still, the message of our broad overview is a positive one: there are low-cost paradata types that offer a great starting point.
2 Paradata and Other Data
In Sect. 2.1, we review four data types that appear in conjunction with ‘paradata’ in the literature. A key takeaway will be that there are overlaps between them and paradata, as simplified and visualized in Fig. 1. In Sect. 2.2, we discuss paradata: definitions, the relation to ‘process’ and process data, and our own, broad definition.
2.1 Substantive Data, Metadata, Auxiliary Data, Contextual Data
Substantive Data
are “what surveys are designed to collect or produce” (Couper, 2017b, p. 4): they correspond largely to the participants’ survey responses but also include, e.g., samples and measurements taken by respondents, interviewers, or sensors (Groves et al., 2009, ch. 2.2.2; Keusch et al., 2024). Unfortunately, ‘data’ are used as both a synonym for substantive data and an umbrella term for all data types (i.e., including substantive data, paradata, metadata, and so on).
Metadata
are, nowadays, “any descriptive information about some object(s) of interest” (NAS, 2022, p. 96). Thus, survey metadata are information about the survey, its components, and the produced data—“the core of [survey] documentation” (Kreuter, 2013, p. 3). Metadata variables are on a more macro level,Footnote 10 exhibiting little variability (ibid.): e.g., the survey’s response rate is one single value. This is illustrated by considering three important categories of metadata specific to surveys:Footnote 11
1. Descriptions of the survey include its name, an outline of study goals, the survey mode, and the interviewer training handbook.
2. Metadata on items, often in a codebook, encompass the names of the variables, possible values, interviewer instructions, and question wordings.
3. Aggregated data and (statistical) summaries can come from aggregating paradata (yielding, e.g., the overall response rate) or aggregating substantive data (e.g., the share of female respondents).
We notice two problems, particularly in relation to paradata. First, there is no agreement on where metadata begin: i.e., how much describing, summarizing, or, crucially, aggregation turns microdata into metadata. Overlaps and inconsistencies are thus inevitable: e.g., information on an item is metadata (about that item), but, in relation to the whole survey, is also sometimes treated as paradata. Second, often in quick succession ‘data’ are introduced to mean ‘substantive data’, and then ‘metadata’ are defined as “data about data”,Footnote 12 implying that the second “data” in “data about data” solely refer to the substantive data—erroneously (see category 3 above and Kreuter, 2013, p. 3). In actuality, there are metadata on substantive data, metadata on paradata, and so on, although these distinctions are rarely made.Footnote 13
Auxiliary Data
Without a universally accepted definition, we follow Kreuter (2013, ch. 1.3): auxiliary data are all data other than the substantive data and thus include paradata.Footnote 14 ‘Auxiliary’ is to be taken literally: supplementary data that are meant to help.
Non-paradata auxiliary data are external to and (overwhelmingly) exist independent of the survey: e.g., administrative data on the same respondents, survey organization employee data on the interviewers, or Census data on area characteristics.
Enrichment with non-paradata auxiliary data can help substantive research (broader or more in-depth information on the very same respondents), reduce respondent burden (fewer questions necessary), and guide survey processes (adjusting contact protocols based on background from the sampling frame). Survey methodology research also benefits greatly: on factual questions, one can determine whether a response is correct by contrasting the survey response with the otherwise unknown true value provided by high-quality, ‘gold-standard’ auxiliary data on the very same individuals. One can then investigate the causes of erroneous answering and offer solutions (see Sect. 1).
Contextual Data
are any information about an event’s or an individual’s context, particularly social, physical, environmental, temporal, or informational context. This also includes information about relevant reference groups (e.g., the family) and abstract concepts (e.g., local social norms and the legal environment).Footnote 15 ‘Context’ goes beyond recording these aspects in isolation, also considering how they interact.
Two Helpful Distinctions: Micro context refers to a specific individual or event, whereas macro context is on a higher level (e.g., regional). Internal context is, e.g., someone’s emotional state, while external context includes local laws.
Subject of the Context: Substantive context is what is predominantly meant by ‘context’ in the larger social science literature:Footnote 16 context pertaining to substantive research questions. For example, for survey respondents asked about their cannabis consumption, the substantive context includes their parents’ attitude and local legality. Survey scientists, however, also consider survey context: the context of conducting surveys and producing survey data, both in general and specific to a particular survey. Macro survey context can be, e.g., restrictions on freedom of expression in the survey’s locale. Such information would come from appropriate auxiliary data and be included in metadata—another example of overlap among data types.
We wish to emphasize the following overlap: most if not all micro survey context is part of paradata.Footnote 17 Micro context can influence behavior at the interview or response level and thus is part of the processes producing a survey’s data: e.g., how sensitive is a question to a particular respondent–interviewer pairing (Tourangeau & Yan, 2007, p. 860)? Much of micro context consists of such latent constructs. Thus, one has to rely on individuals’ self-reports, interviewers’ observations, and other proxy indicators. Further examples are provided in Sect. 3.
Figure 1 highlights some overarching results of Sect. 2.1. First, the data types do overlap. Second, the micro–macro consideration is useful but does not distinguish data types conclusively. Third, context in its various conceptions is part of all data types and of paradata in particular.
2.2 Paradata Definitions
Above, we discussed a first pitfall for grasping paradata: overlaps. A second challenge is that definitions in older, seminal works do not fully reflect the current understanding. There is also still no universally accepted definition (Couper, 2017b, p. 4). Third, paradata definitions in the literature rest on different definitional bases. Some definitions even require two of them, source and content (e.g., West, 2011, p. 1 and McClain et al., 2019, p. 199). This is perhaps because of each base’s weaknesses. For each of the three definitional bases (italicized boldface), we give example definitions and discuss some weaknesses below.
Source Paradata are “captured during the process” (Kreuter, 2013, p. 3). Sometimes the by-product or automated nature is emphasized (e.g., Couper, 2000, p. 393 and Roßmann & Gummer, 2016, p. 313).
However, substantive data are also captured during the survey process, and some paradata variables are not captured then but derived later (see Sect. 3). The by-product or automatic nature is missing from, e.g., interviewer observations.
Content Paradata are “describing” or “about” the process (Couper, 2000, p. 393; Nicolaas, 2011, p. 1).
Yet, some paradata are themselves not about any process directly: e.g., raw audio recordings, observed neighborhood characteristics, or whether respondent and interviewer have the same gender (as an aspect of their interaction).
Use Paradata are “used to manage and evaluate the survey process” (Couper, 2017b, p. 4f. on Groves & Heeringa, 2006).
However, sampling frame information and other auxiliary data are also used to “manage and evaluate the survey process”, and substantive research, too, employs paradata. Taken literally, absolutely nothing would be paradata unless and until it has been actually used to “manage and evaluate”.
Process
is a common refrain among paradata definitions. We find the singular ‘process’ misleading: it may be the reason why some equate the whole survey process with only the field phase, the data collection, or even only the interviewing process (see Couper, 2017b, p. 4, on paradata’s narrow origins). Thereby neglected processes include the design phase, postprocessing (editing, labeling, and coding), and two repeated processes: recruitment and the question–response process. The latter itself comprises comprehension, retrieval, judgment/estimation, and reporting processes (Groves et al., 2009, ch. 7.2). All these processes, and their complex relations, influence the survey as a whole and a specific cell’s value in the released substantive data.
From authors of other chapters, we learned that some of their fields struggle with how exactly the terms ‘process data’ and ‘paradata’ relate. In our field, there is near-universal agreement that all paradata are process data.Footnote 18 The reverse question is less settled: some disagree that all process data are also paradata (e.g., Lyberg, 2011, p. 8), whereas others agree, although they often equate the terms just for their paper (e.g., Kreuter et al., 2010a, p. 282 and 286). The former do not provide counterexamples, and unfortunately neither camp defines ‘process data’.
We surmise that some processes, happening in temporal or spatial proximity to survey processes, produce process data, but not survey paradata:Footnote 19 e.g., internal processes such as human resources of the survey organization, or the processes of processing, analysis, and algorithmic decision-making (Enqvist, 2024) on the released substantive data.
Our Definition
is meant to reflect the heterogeneity of what paradata are and of what is regarded as paradata. It synthesizes existing definitions. Survey paradata are data
1. that are themselves not the survey’s substantive data, and
2. that would typically not exist without the particular survey, at least in the particular form available, and
3. that were automatically produced, actively collected, or constructed later on, and
4. that relate to the survey and its processes in at least one of three ways:
   a. Data produced by the survey processes, often as by-products
   b. Data describing the survey processes, including proxies for unobserved constructs and (micro-)contextual information about the survey processes
   c. Data used to manage and evaluate the survey process(es).
3 Paradata Examples
Within each category (boldface), we usually first present primary, raw paradata and then some ‘derived variables’, i.e., created from the former or other data sources. The categories are to facilitate understanding. They partly overlap.
Timing
is first captured as time stamps (time and date) from which much can be derived: on which day of the week is the interview, is it a holiday, or how much time has passed since the start of the field phase, the last interview, etc. Response times are how long it takes a respondent(-interviewer pairing) to complete a specific item in a particular survey (Matjašič et al., 2018); these times add up to the interview duration.
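As a minimal sketch of such derivations (the time-stamp format and values below are invented; actual logs vary by survey software), an item-level response time and the interview’s day of week can be computed from two raw time stamps:

```python
from datetime import datetime

# Hypothetical raw time stamps for one item; ISO 8601 strings are an
# assumption, as real systems may log epoch milliseconds or other formats.
question_shown = datetime.fromisoformat("2024-03-15T14:02:11.250")
answer_confirmed = datetime.fromisoformat("2024-03-15T14:02:19.750")

# Derived variables: item-level response time and day of week
response_time_ms = (answer_confirmed - question_shown).total_seconds() * 1000
day_of_week = question_shown.strftime("%A")
```

Further derived variables (elapsed field-phase time, time since the last interview) follow the same pattern of differencing time stamps.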
Call Records
are kept about prior contact attempts for each sampled unit. Note that survey scientists call contact attempts ‘calls’, regardless of survey mode. Together with each call’s outcomes (disposition codes: noncontact, rescheduled, completed, …; reasons for refusal), they are also termed contact history data. Recruitment phase data are the web survey analogue (McClain et al., 2019, p. 200f.). Much information can be derived: e.g., a unit’s current status; level-of-effort measures (Olson, 2006, p. 744f.); contact sequences (Durrant et al., 2019); and response histories in panelsFootnote 20 (Kreuter & Jäckle, 2008).
Audio, Verbal, or Voice Paradata
comprise recordings and features automatically extracted in real time. Derivable variables include pitch, speed, disfluencies—particularly their levels, changes, and the respondent–interviewer similarity; overspeech; and whether a question was misread by the interviewer (Jans, 2010; Conrad et al., 2013; Olson & Parkhurst, 2013, ch. 3.3.5).
Location Paradata
can come from, e.g., GPS (Edwards et al., 2017, ch. 12.3), other devices (Keusch et al., 2024), or IP addresses (Felderer & Blom, 2022). Interviewer travel distance and patterns or whether the respondent was on the move during the interview are examples of dynamics that can be derived.
Device Paradata
mainly concern web surveys: e.g., device type (PC, smartphone, tablet), operating system, and browser settings (Callegaro et al., 2015, ch. 2.4.2.2).
Human Interface/Input Device Paradata
mostly come in two forms: Keystroke data log each key pressed by the interviewer and the respondent. For example, sequences and how often the help/back/delete keys were pressed can be derived. Mouse tracking captures a computer mouse’s movements and clicks, yielding timed sequences of coordinates and events. They allow the calculation of distance traveled by the mouse cursor, deviation from the direct path, velocity, and acceleration, as well as hovers over response options (Kieslich et al., 2019; Fernández-Fontelo et al., 2023). Both forms inform about navigation, idle times, and whether and when responses were changed and what the previous answer was. Analogues for smartphones and tablets have been developed (Schlosser & Höhne, 2020).
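To illustrate the derivations mentioned above (coordinates invented, no particular tracking library assumed), total cursor distance and average velocity can be computed from a timed coordinate sequence:

```python
import math

# Hypothetical mouse-tracking trail for one item page: (time in ms, x, y)
trail = [(0, 10, 10), (100, 40, 50), (200, 100, 130)]

# Total distance traveled by the cursor: sum of straight-line segment lengths
distance_px = sum(
    math.hypot(x2 - x1, y2 - y1)
    for (_, x1, y1), (_, x2, y2) in zip(trail, trail[1:])
)

# Average velocity in pixels per millisecond over the recorded span
velocity = distance_px / (trail[-1][0] - trail[0][0])
```

Deviation from the direct path, acceleration, and hover detection extend this scheme with the item’s layout coordinates.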
Interviewer Observations
can be about, e.g., the neighborhood (signs of vandalism), dwelling (the presence of children or whether interviewer access was blocked by a gate), person, or interview (interruptions by children). Interviewer ratings are evaluations of, e.g., the respondent’s interest, effort, or satisfaction, and the interviewer–respondent interaction (Kirchner et al., 2017; Jacobs et al., 2020).
Respondent’s Ratings and Self-Ratings
mirror interviewer ratings. Respondents are either explicitly prompted for their ratings or can provide information about their particular survey in an ‘open comments’ section at the end.
Interviewer Characteristics
can be either fixed or varying. The former are sometimes seen as paradata: sociodemographics, position or experience in the organization, etc. (via employee data); or attitudes, traits, education, skills, and years working as an interviewer (via interviewers answering a separate questionnaire). Varying characteristics need to be calculated: the number of prior calls or completed interviews on the same day or during the field phase overall, time since the last interview, etc.
Few fixed respondent characteristics are widely considered paradata, except for some attitudes (about being interviewed, scientific surveys generally, this survey’s topic) and prior survey experience (Matthijsse et al., 2015; Schwarz et al., 2022, ch. 2). Varying respondent characteristics are discussed under Interactions below.
Survey/Interview Characteristics such as incentives, recruitment strategies, and offered mode can vary across units, over time, or across countries in multicountry efforts.Footnote 21
Item Characteristics inevitably differ between items: e.g., length, response options, and topic. But even a particular item may be different across respondents: e.g., because of adaptations based on the participant’s prior responses.
Interactions
are often only accessible via proxies or subjective judgments. For each respondent–interviewer pairing, ratings can be captured, the difference in age, attitudes, or language can be derived, or specific aspects of this complex social interaction can be addressed (Bradburn, 2016). For the respondent-survey interaction, reasons for participating (Schwarz et al., 2022, ch. 2.2) or whether it was conducted in the respondent’s native tongue are examples. The respondent-item interaction, too, contains subjective aspects, e.g., item sensitivity or trouble understanding, and objective aspects, e.g., the number of all or of similar questions answered before.
Micro Survey Context
partly overlaps with observations, ratings, and interactions. Adding to Sect. 2.1, we highlight some further latent constructs (italicized) and respective proxies. Perceived level of privacy (Yan, 2021, p. 120): Was the respondent at home or in a public space? Who else was present: a boss, spouse, or children? Trust and interviewer–respondent rapport (Sun et al., 2021): Was it always the same interviewer, in panels or in continued recruitment attempts (Kühne, 2018)? Engagement and effort: Did the respondent multitask? Did they look up information in documents?
Design PhaseFootnote 22
paradata include changes made after pretesting a survey (see Sect. 5). Online comments are a simple tool for volunteering information: e.g., respondents about comprehensibility, offensiveness, or other issues with a question, or experts about design flaws (Callegaro et al., 2015, p. 105, 109).
Editing/CodingFootnote 22 paradata about each cell of the substantive data can be whether the value came from the respondent or a labeler (Sana & Weinreb, 2008). More detailed information for the latter case includes the labeler id or the rate of agreement among multiple labelers looking at the same cell.
Miscellaneous
Video recordings, eye-tracking measures, and brain activity data are rare because of equipment requirements (Callegaro et al., 2015, p. 108f.).
Some surveys inquire about providing biosamples, willingness to be contacted again, or allowing record linkage to other data. The respondent’s consent decision or reasons for refusal (Sakshaug, 2013) may be indicative of respondent behavior.
The status or relation of who provided information can be relative to the sample unit (targeted person versus family member) or to the information (information provided about oneself or about someone else). In establishment surveys, the respondent’s position within the company might influence their response.
Sensors, wearables, and apps have received attention recently (Keusch et al., 2024), but only a part of these data are paradata.
4 Collecting, Structuring, and Designing Paradata
Below, we consider some differences in how paradata types are collected and structured. This is not just inherent heterogeneity: many paradata types can also be actively designed. Thus, paradata are not necessarily ‘found’ or ‘organic data’ (Groves, 2011; Japec et al., 2015, p. 843) over which researchers have no discretion.
Resolution
Device-recorded paradata are usually constrained only theoretically, as technical resolution limits are beyond sufficient.Footnote 23 Resolution for response times should be at least 100 milliseconds, better 10 or even 1 (Mayerl, 2013, p. 3): if differences across individuals (‘signal’) are on the order of seconds, then measuring only in whole seconds adds rounding error (‘noise’) that needlessly degrades the signal-to-noise ratio.
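A toy example (response times invented) makes the information loss concrete: at one-second resolution, respondents who differ by several hundred milliseconds become indistinguishable.

```python
# Invented item-level response times in milliseconds for three respondents
times_ms = [1450, 1900, 2350]

# At 1 ms resolution all three differ; coarsened to whole seconds
# (1000 ms resolution), the second and third respondent collapse
# into the same value, and the 450 ms difference is lost forever.
coarse_s = [round(t / 1000) for t in times_ms]
```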
Human-recorded paradata should heed survey methodology’s advice on how to construct items and response scales (e.g., Bradburn et al., 2004; Groves et al., 2009, ch. 7).Footnote 24 Here, higher resolution can be detrimental: e.g., contact history systems with very many disposition codes (AAPOR, 2016, p. 71ff.) may produce minuscule counts for some outcomes and errors (e.g., by inexperienced interviewers). If one anticipates combining categories later on, then the design should facilitate this.
Granularity
Ratings may reflect the whole interview, segments, or use, e.g., ‘increasing/steady/declining’ to capture dynamics. Response time can be at the level of questionnaire sections, the page (web mode), item, or even finer (see Components). Item-level analysis is impossible when measurements are only at the page level.
In web surveys, server-side measurements (Callegaro, 2013, p. 262) can always be implemented, but they are restricted to the page level, and (differential) transmission and loading times are unwanted components. Client-side measurements (i.e., collected on respondents’ devices) avoid these drawbacks, but their collectibility depends on user consent and device (Callegaro et al., 2015, ch. 5.3.4.1).
Components: Splitting/Combining
The idealized question–answer sequence comprises an interviewer reading aloud, the respondent’s cognitive processing, their answering, and the entering of the response. Thus, item-level response time can be split into four components, yielding more nuanced information.
Aggregation: This often refers to the level at which a variable operates (varies) within the complex, hierarchical survey structure: item, respondent, interviewer, call, or interview. Appropriate aggregation reduces informational overload (see Sect. 6) and enables targeted applications: e.g., interviewer-level monitoring and item-level design evaluation (see Sect. 5).
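A minimal sketch of such aggregation (record layout and values invented): the same item-level measurements can be summarized per respondent or per interviewer, depending on the application.

```python
from collections import defaultdict
from statistics import mean

# Invented item-level records: (interviewer, respondent, item, response time ms)
records = [
    ("int1", "r1", "q1", 900),
    ("int1", "r1", "q2", 1100),
    ("int1", "r2", "q1", 2000),
    ("int2", "r3", "q1", 1500),
]

def mean_time_by(records, level):
    """Mean response time per unit at one level of the survey hierarchy."""
    index = {"interviewer": 0, "respondent": 1}[level]
    groups = defaultdict(list)
    for rec in records:
        groups[rec[index]].append(rec[3])
    return {unit: mean(times) for unit, times in groups.items()}

by_respondent = mean_time_by(records, "respondent")    # e.g., respondent-level models
by_interviewer = mean_time_by(records, "interviewer")  # e.g., interviewer monitoring
```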
Ex post, one can usually decrease granularity, combine components,Footnote 25 and aggregate. The reverse direction is typically impossible: information not captured is lost forever.
Degree of Automation
Some paradata are always collected automatically (e.g., keystrokes) and some are never (e.g., interviewer observations). For others, such as response times, there are several options:
1. Manual, ‘active’ time stamps: The interviewer presses a button after having read the question aloud completely and again when the participant starts responding.
2. General automation: The timer always starts when the question appears on-screen and stops when the response is confirmed.
3. Specific automation: A voice-activated system recognizes when speaking starts and stops.
Each approach has its advantages. Interviewers (1) know best when they finish and can also ignore nonanswers (e.g., thinking out loud or asking for clarification); they can also record whether a measurement was valid. Meanwhile, automation frees up the interviewer. General automation (2) is unsusceptible to inadvertent button-pressing and nonanswering but combines all interviewer and respondent components into one measurement. Specific automation (3) can separate them but is hampered by overspeech, low volume, nonanswering, and needs specialized software. Combined, semi-automated versions try to reap the benefits of each approach.
Raw Paradata
are sometimes not fit for the intended use.
Preprocessing turns raw mouse-tracking data—a continuous stream of events, coded in computer language—into comprehensible, usable information (Olson & Parkhurst, 2013, ch. 3.3.3, 3.5.1). Specialized software exists (Wulff et al., 2021; Henninger et al., 2022b), unlike for processing tasks needing human labelers such as rating recorded interactions. Other paradata may need trivial (response time \(=\) stoptime \(-\) starttime) or no work (interviewer ratings).
Adjustment shall denote the correction for unwanted properties. Raw response times are influenced by characteristics of the respondent (e.g., their general baseline speed; Mayerl et al., 2005), interviewer, item, device, and so on (Couper & Peterson, 2017; Sturgis et al., 2021, ch. 1). Response times become comparable only after accounting for such influences; otherwise, it would remain unclear whether someone was speeding on an item or was just generally fast. This is usually done by statistical regression (Couper & Kreuter, 2013), which is only possible after data collection, i.e., not in real time. Further examples benefiting from adjusting for respondent idiosyncrasies include verbal and mouse paradata.
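As an illustrative sketch (values invented; a real analysis would use the regression approach cited above), the simplest adjustment centers each respondent’s times on their own baseline speed, so that deviations rather than absolute times are compared:

```python
from collections import defaultdict
from statistics import mean

# Invented item-level response times: (respondent, item, ms). Respondent r2
# is slower overall, but relative to their own baseline both respondents
# behave alike: fast on q1, slow on q2.
records = [
    ("r1", "q1", 800), ("r1", "q2", 1200),
    ("r2", "q1", 3000), ("r2", "q2", 3400),
]

times_by_resp = defaultdict(list)
for resp, _, t in records:
    times_by_resp[resp].append(t)
baseline = {resp: mean(ts) for resp, ts in times_by_resp.items()}

# Adjusted time = deviation from the respondent's own mean speed
adjusted = {(resp, item): t - baseline[resp] for resp, item, t in records}
```

Regression generalizes this idea by simultaneously accounting for interviewer, item, and device influences.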
Capturing Paradata
can take on different, potentially complementary forms. Location mostly concerns server-side versus client-side capture in web surveys (see Granularity). Ratings can originate from interviewers, respondents, or labelers; observations from interviewers, recruiters, or listers; and response times from human interviewers or several devices.
Availability
for all relevant units and at a given point in time limits applications (see Sect. 5).
When
While interviewers continually observe the process, their evaluations are only recorded after the interview’s conclusion. Other paradata are available in (near) real time—even some derived variables: e.g., idle times and response editing, from keystrokes or mouse tracking. However, a need for nonautomated or time-consuming preprocessing or adjustments impedes real-time interventions (Mittereder, 2019, p. 153). On a different note, early in the field phase the paucity of paradata limits applications (West et al., 2023).
On whom
There are more data for completed cases (e.g., ratings) than for breakoffs (anything collected until termination), contacts (outcome codes; some interviewer observations), and noncontacts (GPS; neighborhood observations) (Sakshaug & Kreuter, 2011).
5 Applications
Paradata are employed to address the various survey methodological challenges (see Sect. 1): errors of representation, errors of measurement, and missing data, i.e., missing units/rows (‘unit nonresponse’) and missing responses/cell entries (‘item nonresponse’). The first goal is to recognize errors and the underlying mechanisms. Then, the design of future surveys can be improved. Also, for given survey data, statistical methods such as imputation and weighting can be used to derive unbiased results even from deficient data. The second goal is to monitor data quality and to predict problems: this is the basis for interventions.
Any statistical modeling of behavior is constrained by what paradata are available for all relevant units. Prediction, in addition, is restricted by the information available at prediction time (see Sect. 4).Footnote 26
Unit Nonresponse
is likely the most studied application. Other than the (very limited) sampling frame information, paradata may be all that is available for respondents and nonrespondents alike (Sinibaldi et al., 2014):Footnote 27 e.g., observations about the neighborhood, dwelling, or individual(s), call histories, and interviewer characteristics such as their voice (Kreuter & Casas-Cordero, 2010, p. 3; Olson, 2013; Charoenruk & Olson, 2018).Footnote 28
Avoidance of nonresponse bias builds on increasing the recruitment effort, monetary incentives (Jackson et al., 2020), or the many adaptive survey design strategies (see below) on cases predicted to be difficult or important for sample representativity.
Adjustment for nonresponse in the eventual data analysis of the substantive data usually involves some form of weighting based on the response propensity P. To repair nonresponse bias, a paradata variable employed in estimating P must be strongly correlated with both P and the survey variable of interest (Kreuter et al., 2010b). However, a single available paradata variable rarely exhibits enough correlation with both; using multiple variables may help (Kreuter & Olson, 2011).
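As an illustration, propensity-based weighting in its simplest, weighting-class form (invented data; as noted above, this repairs bias only if the paradata variable relates to both response propensity and the survey variable):

```python
# Hedged sketch of weighting-class nonresponse adjustment: each respondent
# gets weight 1 / (response rate of their paradata class). Data are invented.
from collections import Counter

# One paradata variable (e.g., observed dwelling condition) per sampled unit,
# plus whether the unit responded.
units = [
    ("good_condition", True), ("good_condition", True), ("good_condition", False),
    ("poor_condition", True), ("poor_condition", False), ("poor_condition", False),
]

n_class = Counter(c for c, _ in units)        # sampled units per class
n_resp = Counter(c for c, r in units if r)    # respondents per class
weights = {c: n_class[c] / n_resp[c] for c in n_resp}
# weights == {"good_condition": 1.5, "poor_condition": 3.0}
```

Each respondent in the harder-to-interview class then stands in for more sampled units in the analysis.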
Panel dropout of participants between waves of a panel survey can be studied with prior waves’ paradata: comments (McLauchlan & Schonlau, 2016); response behavior and speed (Roßmann & Gummer, 2016); interviewer observations (Plewis et al., 2017); and habitual late responding (Minderop & Weiß, 2023). Breakoffs are more frequent in web mode, on mobile (versus PC) and nonpreferred devices, and preceding response behavior such as speeding and instability is predictive (Mittereder, 2019; Couper et al., 2017; Chen et al., 2022; Mittereder & West, 2022).
Coverage Error
can be addressed in two ways (Eckman, 2013). First, two sources can be compared: e.g., the sampling frame versus a ‘lister’ walking a neighborhood to collect addresses and contact information, yielding flagging, additions, and deletions of units. How well self-reports, sampling frame information, or interviewer observations match on survey inclusion criteria is an indicator of their accuracy. Second, it is of interest whether particular circumstances affected sampling frame creation: e.g., duration, weather, time, location data, lister’s or interviewer’s discretion, and edits. Device paradata can inform about the error that would be introduced if survey participation required apps available only for some smartphone models (Couper et al., 2017, ch. 7.2). Similarly, to increase representativeness, some online panels have offered free devices and Internet access to those lacking them (Blom et al., 2017). Paradata on who was such an ‘offliner’ allow studying such programs’ success regarding participation and improving substantive results (Cornesse & Schaurer, 2021; Eckman, 2016).
Errors of Measurement
(Yan & Olson, 2013) and Item Nonresponse are often studied jointly as both concern (error-prone and missing, respectively) cell entries in the substantive data.
Paradata measure or proxy behaviors, context, and mechanisms that influence these data quality aspects: device (Lugtig & Toepoel, 2016); multitasking (Sendelbah et al., 2016; Höhne et al., 2020b); regional context (Purdam et al., 2020); rapport (Sun et al., 2021); consistency of related answers (Revilla & Ochoa, 2015); uncertainty, slow or fast responding, changing of responses, soliciting help, interviewers misreading (Yan & Olson, 2013); ratings (Holbrook et al., 2014; Olson & Parkhurst, 2013, ch. 3.3.6); verbal paradata (Jans, 2010); reasons for participating such as incentives (Matthijsse et al., 2015; Schwarz et al., 2022); and interviewer characteristics influencing the sensitivity of a specific question (Peytchev, 2012). Respondent self-reports provide additional information to that already contained in other paradata (Revilla & Ochoa, 2015; Höhne et al., 2020a).
Adaptive Survey Design
(ASD) and Responsive Survey Design (RSD)Footnote 29 are popular, mostly to lower costs and increase data quality (e.g., Wagner et al., 2012). One perspective is that the harder a unit is to recruit, the more similar it presumably is to nonrespondents (Olson, 2013, p. 155). Thus, when data collection stabilizes, i.e., primary substantive variables do not change anymore with increased contact attempts, one may move to another RSD phase, tweak protocols, or stop data collection. Real-time interventions in ASD can be appropriate pop-up messages to prevent breakoffs (propensity predicted with paradata: Mittereder, 2019, ch. 6) or slow down speeders (Conrad et al., 2017). Offering clarifications in self-administered surveys based on age-adjusted idle time can improve response accuracy and satisfaction (Conrad et al., 2007). Allowing the interviewer to ask only the most important questions when they predict a high risk for unit nonresponse or breakoff is more drastic (Lynn, 2003).
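At its core, a real-time intervention of the kind just described can reduce to a simple trigger rule; the thresholds below are invented for this sketch, not those of Conrad et al. (2007):

```python
# Hypothetical trigger rule: offer a clarification when idle time exceeds an
# age-adjusted threshold. The threshold values are invented assumptions.
def offer_clarification(idle_seconds: float, age: int) -> bool:
    threshold = 15.0 if age < 65 else 25.0  # older respondents get more time
    return idle_seconds > threshold
```

In production, such a rule would run inside the web survey client and be tuned empirically, e.g., against the age-specific idle-time distributions observed in pretests.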
Monitoring and Evaluation
guide the complex survey processes in real time (Couper, 2017b, p. 10). Dashboards (Mohadjer & Edwards, 2018, p. 263ff.) visualize information for survey managers: in particular, ‘key performance indicators’ of costs, data quality,Footnote 30 and interviewer performanceFootnote 31 (Meitinger et al., 2020).
Performance can improve with feedback to interviewers when data sources conflict (GPS vs. call history about locations: Edwards et al., 2017, ch. 12.3; Wagner et al., 2017, p. 221) or when paradata indicate deviations from protocols (Edwards et al., 2020). Recordings can be reviewed for quality control. Interview durations may inform about deviant behavior (fabricated interviews: Schwanhäuser et al., 2022).
Evaluation of Survey Design
in the evaluation phase (Maitland & Presser, 2018), in pretesting (Couper, 2000; Stern, 2008), by experts (comments about items: Callegaro et al., 2015, p. 105, 109), and during the field phase uses paradata to indicate problems: slow responding, rates of item nonresponse and changed responses, going back to earlier related items, interviewer evaluations, and labeler-coded behavior.
Costs
not being available in real time or in full detail hampers survey administration. Then, estimating cost parameters from call histories may help (Wagner, 2019).
Substantive Research
has used survey paradata, too, but is beyond the focus of this survey methodological chapter. For example, interviewer observations can supplement the substantive data when missing or for quality control: e.g., the presence of wheelchair ramps and cigarette butts in health surveys (West, 2018a, p. 212).
Response times or ‘latencies’ have long been used to study cognitive processes such as the degree of elaboration (deliberative-controlled or automatic-spontaneous processing), abilities, strength of attitudes, and mental availability of information (Johnson, 2004; Mayerl, 2013; Kyllonen & Zu, 2016; De Boeck & Jeon, 2019).
6 Challenges and Some Solutions
Paradata Quality
is understudied in general (West & Sinibaldi, 2013). Interviewer-produced paradata have received relatively more attention (Olson, 2013, p. 159). Automation (see Sect. 4) and objective paradata (West & Sinibaldi, 2013, ch. 14.2.3) do not guarantee high(er) quality.
Errors
, including a lack of internal validity or reliability, in interviewer observations have been noted at rates between \(<\)10% and 92% (West, 2011, p. 4; West, 2013b). Context (e.g., seasonality, cooperation, sensitivity) and characteristics of the respondent, household, area, and interviewer can influence interviewer observations (West & Li, 2019; West & Blom, 2017, ch. 4.8). Unfortunately, performance may actually decline during the field phase (West & Sinibaldi, 2013, p. 351). Also, interlabeler reliability can be challenging (verbal paradata: Jans, 2010, ch. 2.2).
Missing Values
can be frequent in, e.g., interviewers’ neighborhood observations (Olson, 2013, p. 146) and call records (Wagner et al., 2017, ch. 5.3). Reasons include ambiguous guidelines or cases (Biemer et al., 2013), forgetting when recording later,Footnote 32 and hesitancy to record sensitive information. Also, using multiple devices can hinder completeness (Höhne et al., 2020a, p. 994).
Solutions
for improving general survey quality are also informative for survey paradata quality. For instance, operationalizations of interviewer observations should heed survey methodology’s general lessons better (see Sect. 4 and Kreuter, 2018b, p. 534). Systems must facilitate easy, timely entry (ibid.), with errors easy to correct or flag: e.g., interviewers can rate each response time measurement as valid, respondent error, or interviewer error (Mayerl, 2013, step 2.2). Automatic consistency and completeness checks (West & Sinibaldi, 2013, p. 352f.) can compare across data sources (for location: from GPS vs. from call records) or to normal values (unusual response times: West, 2013a, p. 352f.): after all, interviewers can prevent or correct problems best in real time. Frequently recommended are standardized, high-quality, survey-specific training and, periodically or when needed, retraining of interviewers, reminders, checklists, instructions, and the like (e.g., Kreuter, 2018b, p. 534).
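Such a cross-source check can be minimal; here is a hedged sketch with invented locations, comparing GPS readings against the interviewer-entered call record:

```python
# Illustrative automatic consistency check across paradata sources: flag
# contacts whose call-record location disagrees with the GPS reading.
# Contact IDs and area codes are invented for this sketch.
gps_location = {"c1": "area_A", "c2": "area_B"}   # from the device
call_records = {"c1": "area_A", "c2": "area_C"}   # entered by the interviewer

flags = [c for c in call_records if call_records[c] != gps_location.get(c)]
# flags == ["c2"]: the sources disagree, so the contact is reviewed
```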
Informed Consent
about paradata collection is an ongoing debate (Connors et al., 2019, p. 187f.). Should respondents be informed—and how?Footnote 33 Do they need to consent (Kunz et al., 2020a, p. 397f.)—at all,Footnote 34 as part of one overall agreement to participate in the survey, or in a separate paradata consent question?Footnote 35 Nonconsent may reduce participation (Couper & Singer, 2013) and bias samples (Felderer & Blom, 2022, p. 878), although much less so when following the emerging best practices (Kunz et al., 2020b). We would like to caution that in the long run a lack of transparency could backfire and reduce trust and participation rates.
Confidentiality
concerns are highest for, e.g., address details, interviewer observations, open-text answers, and recordings (Nicolaas, 2011, p. 15). Selective anonymization is hard for unstructured paradata (Kreuter, 2018b, p. 535). The general approaches for sensitive data (see Shlomo, 2018 and Bender et al., 2020) can be solutions for paradata, too. Also, paradata may be used in real time, never leaving a respondent’s device (Henninger et al., 2022a, p. 16). Finally, perceived privacy (Nicolaas, 2011, p. 15), the actual driver of consent and behavior, must reflect reality.
Availability of Paradata
is hampered by organizations guarding internal best practices, by the resources needed for preparation, warehousing, and documentation (Nicolaas, 2011, p. 16 and 14; Olson, 2013, p. 162), and by confidentiality questions (Kreuter, 2018b, p. 535). (Micro) paradata are released more frequently nowadays but often contain only some of the paradata variables or only the completed interviews. Research about paradata may also stay internal for similar reasons or because improvements are deemed small (Wagner, 2013b, p. 166).
Standardization
may be helped indirectly by the dominance of a few software solutions (web: McClain et al., 2019, p. 201f.).Footnote 36 Yet, in contrast to metadata, there are almost no universal paradata standards (Vardigan et al., 2016, p. 445; Couper, 2017b, p. 7). Even within an organization there may be heterogeneity in how information is recorded: e.g., among interviewers with different experiences at prior employers or between survey methodologists and interviewers. Concrete, clear standards are key. Yet, standardization must leave room for tailoring paradata (Kreuter, 2018b, p. 534): e.g., to specific contexts and needs (see West & Sinibaldi, 2013, p. 347 and Sect. 5 on nonresponse adjustment variables having to fit the specific application).
Overwhelming
users is a common worry about paradata (Couper, 1998, p. 45; Kreuter et al., 2010a, ch. 5). This is in part, but not only, about volume.
The informational content per observation is high for only some variables: e.g., an interviewer’s exhaustive free-text call notes may be useful to themselves but overwhelm follow-up interviewers or managers (West & Sinibaldi, 2013, p. 347). Standardizing and structuring the minimum informational content while making additional notes optional is an easy fix. Many paradata variables are or might be available. Beginning with those that one knows will be used and for which one has applications is a great starting point (West, 2018a, p. 213). Some paradata variables have many data points: e.g., every single mouse coordinate.
Instead of appraising every single, microlevel value, information is aggregated to the appropriate level, reduced in dimension (e.g., by clustering) or to special cases (e.g., outliers), or fed into statistical methods.
Handling Paradata
can seem daunting at first. Yet, the separate files for call records, interviewer characteristics, and item-level paradata can be merged. Levels may be changed by aggregation, or, in files, by ‘reshaping’ between long and wide data formats. All this is facilitated by software and need not be done manually.
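A toy sketch of such reshaping from long format (one row per unit and item) to wide format (one row per unit), with invented field names; statistical software does this in one command:

```python
# Minimal sketch: reshaping item-level paradata from long to wide format.
# Field names ("unit", "item", "rt") are invented for this illustration.
long = [
    {"unit": "u1", "item": "q1", "rt": 12.0},
    {"unit": "u1", "item": "q2", "rt": 37.0},
    {"unit": "u2", "item": "q1", "rt": 9.0},
]

wide = {}
for row in long:
    # One entry per unit; each item becomes its own column-like key.
    wide.setdefault(row["unit"], {})[f"rt_{row['item']}"] = row["rt"]
# wide == {"u1": {"rt_q1": 12.0, "rt_q2": 37.0}, "u2": {"rt_q1": 9.0}}
```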
The structure of many paradata variables can be nontrivial. Where detailed statistical analysis of paradata is needed, hierarchical, complex structures are addressed with multilevel modeling.Footnote 37 Call records are an example of unbalanced data: zero, one, or more observations per unit. Yet, this is only sometimes actually problematic, and simple aggregation is often sufficient: e.g., counts per unit. There are also less crude methods that can target patterns as a whole, e.g., in call histories and mouse movement trajectories (Durrant et al., 2019; Fernández-Fontelo et al., 2023).
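The simple aggregation mentioned above can be sketched as follows (invented call records), collapsing unbalanced records to one count per sampled unit, including units never called:

```python
# Minimal sketch: counts per unit from unbalanced call records.
# Unit IDs and records are invented for this illustration.
from collections import Counter

sample = ["u1", "u2", "u3"]        # all sampled units
call_records = ["u1", "u1", "u3"]  # one entry per contact attempt

calls = Counter(call_records)
calls_per_unit = {u: calls.get(u, 0) for u in sample}
# calls_per_unit == {"u1": 2, "u2": 0, "u3": 1}
```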
Heterogeneity
abounds across cases. Some accrue more information (completed interviews) or more observations (repeated calls). One variable may capture different concepts (Olson, 2013, p. 159): attempts made (nonrespondents) or calls needed for success (respondents). In surveys with multiple modes (e.g., ASD and RSD), some variables are not available in each or not directly comparable (Kreuter, 2018a, p. 195).
Information Is Lacking
on many processes (respondents’ and interviewers’ true motivation, states, and behavior) or because of too few cases (e.g., breakoffs and fabricated interviews). Unsupervised learning (James et al., 2021, ch. 12) may help: e.g., clustering for finding deviant interviewer behavior (Schwanhäuser et al., 2022).
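In that spirit, a toy two-group clustering of interviewers' mean interview durations (numbers invented; real analyses such as Schwanhäuser et al., 2022, use far richer indicators than a single duration):

```python
# Toy 1-D two-means clustering, flagging the fast cluster of interviewers.
# Durations and interviewer IDs are invented for this sketch.
def two_means(values, iters=50):
    lo, hi = min(values), max(values)  # initialize centers at the extremes
    for _ in range(iters):
        groups = ([], [])
        for v in values:
            # index 1 (True) if v is closer to the high center
            groups[abs(v - hi) < abs(v - lo)].append(v)
        lo_new = sum(groups[0]) / len(groups[0]) if groups[0] else lo
        hi_new = sum(groups[1]) / len(groups[1]) if groups[1] else hi
        if (lo_new, hi_new) == (lo, hi):
            break
        lo, hi = lo_new, hi_new
    return lo, hi

durations = {"i1": 31.0, "i2": 29.5, "i3": 30.5, "i4": 8.0}  # minutes/interview
lo, hi = two_means(list(durations.values()))
flagged = [i for i, d in durations.items() if abs(d - lo) < abs(d - hi)]
# flagged == ["i4"]: suspiciously short durations, reviewed for fabrication
```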
Misalignment of Incentives
between, e.g., interviewers and survey designers or researchers, can be problematic. Yet, studies of, e.g., prevalence and reasons for interviewers ignoring recommendations are rare (call timing: Wagner, 2013a, Experiment 5; travel routes: Tourangeau, 2021, p. 17f.). Remuneration schemes ignoring the time needed to record paradataFootnote 38 clash with expectations for high-quality paradata.Footnote 39 With (perhaps diffuse) monitoring, interviewers may feel the need to demonstrate performance (West & Sinibaldi, 2013, p. 343, 347). Transparency is a partial solution (West & Groves, 2013, p. 373): letting the interviewer know why they get relatively more difficult cases and that good paradata help fair evaluation.
Overall, one may need to convince the interviewers of the value of quality paradata, in general and to themselves, via improved case assignments and improved recommendations (West & Sinibaldi, 2013, p. 348; West, 2018a, p. 212). The same is true for survey managers, listers, recruiters, and other actors on the ground or in decision-making positions (Olson, 2013, p. 161).
7 Discussion
(Un)intended Consequences
of making paradata and paradata collection explicit need further study. Changed behavior among “watched” respondents is plausible but has not been found yet (Kunz et al., 2020a, p. 402) except for participation (Henninger et al., 2022a, p. 5f., 9). When recorded, interviewers produce fewer suspiciously short durations (Olbrich et al., 2022). On a different note, making interviewers predict respondent behavior could yield self-fulfilling prophecies (Eckman, 2017, ch. 3).
Perspectives
on paradata are many and varied. This is true across disciplines, as this volume shows, but also within our field. Most research has started from either the available paradata or established knowledge about surveys. Those on the ground—labelers, field staff, interviewers (Jans, 2010, ch. 2.2; West & Sinibaldi, 2013, ch. 14.2.2.1; West & Trappmann, 2019)—have hitherto untapped knowledge about processes, their own strategies, and working with researchers’ paradata instruments.
Ethics
and critical reflection of potential harm from paradata collection and applications are paramount (AAPOR, 2021). Survey methodology is shaped by mostly benign surveys. In the West or elsewhere, respondents and interviewers from some locales, contexts, or specific groups are rightfully afraid of negative consequences from honest answering or mere participation. Yet, many of the ethical and legal struggles (see also Sect. 6) are not unique to paradata (Conrad et al., 2021, p. 254).
Costs and Trade-Offs
relate questions of data quality to each other and to the real world. Paradata may be by-products—they are not why surveys are conducted—but they are not cost-free: Systems need development and maintenance; recording information (interviewers), monitoring quality (managers), and training (both) take time and effort; paradata must be preprocessed and documented before being released. That paradata are high-quality is not a given, either (see Sect. 6).
Our field does not have a common framework for all survey costs and few empirical studies on utility per dollar. Trade-offs are recognized but hard to quantify. Resources spent on paradata basics (e.g., infrastructure) cannot be spent to improve one survey’s substantive data (quantity or quality) but can benefit many future surveys.
We have discussed many examples and challenges to provide a broad overview, but one important message should not get lost: some paradata types are easy to capture and contain much information relative to the resources that must be invested.
Take a Paradata Perspective When Helpful
Whether everyone agrees that something is paradata, or would agree had it been created differently, will not diminish its usefulness. Paradata are not an end unto themselves, but “additional […] tools” to help in practice (Couper, 2017b, p. 11), not meant to replace other tools or perspectives. Use may not seem the most important definitional base for paradata (see Sect. 2.2), but, after all, applications are why we capture paradata.
Notes
- 1.
Our introduction relies on Groves et al. (2009). Note that surveys are distinct from the less structured qualitative methods that also use interviewing: of experts, focus groups, etc.
- 2.
In this chapter, we will repeatedly refer to the interviewers. Such statements, of course, only apply to survey modes that feature them. We omit this constant reminder.
- 3.
New data sources such as Big Data have gained prominence due to their often low cost and large volume. However, they rarely offer the breadth and level of detail of a survey, researchers typically have little influence over and information about what and how data are captured, and data quality can be very problematic. Survey data and these other data sources can, of course, be complements (Japec et al., 2015, p. 873 and Couper, 2017a, p. 134f.) and help to improve one another’s methodology (Hill et al., 2021). Surveys are here to stay.
- 4.
‘Substantive’ in the sense of the substantive/empirical sciences (studying real-world phenomena), as opposed to the methodological/formal sciences (doing methods research).
- 5.
The other, “fitness-for-use” quality dimensions of surveys are important per se but are not at the forefront of survey methodology: relevance, credibility, and timeliness (Groves et al., 2009, ch. 2.6).
- 6.
We use ‘researchers’ as a shorthand. Their substantive research questions are to be answered by the eventual analysis of the survey data, and they thus also influence what the survey is about.
- 7.
One may also look at a column as a whole: is there a difference between what this survey question should capture and the values that one got?
- 8.
This is the more general and technical term than ‘question’ as surveys also contain, e.g., prompts that are not phrased as questions.
- 9.
See Couper (2017b, p. 6 and 11) who also reflects on the evolution of the concept ‘paradata’.
- 10.
We use ‘macro’ to denote aggregate/higher level and ‘micro’ for individual/low level (e.g., the level of the contact attempt). Just as microeconomics considers individual consumers, firms, etc., while macroeconomics studies countries as a whole (comprised of the former).
- 11.
- 12.
This original, historical definition (NAS, 2022, p. 95) also, in the first “data” (instead of using ‘information’), hides the pronounced heterogeneity among metadata of which the three categories above provide only a glimpse.
- 13.
- 14.
- 15.
- 16.
Also, often only external and macro context is considered—without even addressing these restrictions. Terms like ‘environment’ and ‘surrounding’ may unnotedly induce too narrow notions of ‘context’.
- 17.
This was perhaps first articulated very explicitly by Kreuter (2018a, p. 193).
- 18.
We know of only two, early exceptions that did not catch on: very wide definitions that include other auxiliary data, thus going beyond ‘process’ (Kennickell et al., 2009, p. 1: sampling frame), and those distinguishing routine/process paradata from added/(interviewer-)observational paradata (Smith, 2011, p. 1f.).
- 19.
Perhaps unless repurposed (see definitional base Use): e.g., billing information from traveling interviewers may be used for cross-checking the so-called call records (see Sect. 3).
- 20.
‘Panel surveys’ interview the same respondents at multiple points in time (‘waves’), e.g., annually.
- 21.
If fixed they are considered metadata.
- 22.
These are examples of paradata not accruing in the field phase. They are rarely released.
- 23.
Data volume may be a constraint for transmitting and storing audio or video recordings.
- 24.
For example, for ratings: a 5-point scale, each point labeled, ordered from low to high, and equidistant.
- 25.
Problematic are components that are overlapping or nonsequential. For instance, summing response times over the seemingly additive four question–answer process components produces an overestimate when the participant starts responding before the interviewer finishes the question. Clever programming can sometimes solve such problems.
- 26.
This includes the subtle ‘data/target leakage’ (Ghani & Schierholz, 2020, ch. 7.8.1).
- 27.
Ditto for what is available before contacts are attempted.
- 28.
There is more information for refusals and dropped contacts (Sakshaug & Kreuter, 2011).
- 29.
See Schouten et al. (2017, ch. 2.2). ASD (Wagner, 2008) aims to tailor survey design to individuals/groups: e.g., case prioritization (with paradata-predicted propensities: Wagner et al., 2012), call timing (time successful in previous wave: Kreuter & Müller, 2015), contact strategies, mode, and interviewer type (Tourangeau, 2021, ch. 2). In contrast, RSD (Groves & Heeringa, 2006) tries different designs in prespecified early phases of the field phase to arrive at an optimal final design.
- 30.
‘Representativeness indicators’, using contact history paradata (Schouten et al., 2012).
- 31.
Accounting for each interviewer’s case difficulties with call histories (West & Groves, 2013).
- 32.
- 33.
- 34.
Requiring consent is reasonable but hard in practice: e.g., nonrespondents can hardly consent.
- 35.
Paradata, being largely invisible, are different from the substantive survey questions. The respondent gets to know each of these and can, for each item, choose not to respond or to discontinue.
- 36.
However, the reliance on a few, underfunded or volunteer ‘research software engineers’ is very worrisome, especially as these systems must coevolve with technology and society.
- 37.
See Couper and Kreuter (2013) on response times. Multiple observations belonging to the same unit or interviews by the same interviewer are correlated and not isolated, independent data points.
- 38.
Time requirements can be sizable: e.g., 15–20 minutes per interview (West, 2018b, p. 541).
- 39.
Conversely, West and Sinibaldi (2013, p. 344) did not find that rewarding each contact attempt induced interviewers to overreport calls.
References
AAPOR (2016). Standard definitions: Final dispositions of case codes and outcome rates for surveys (9th ed.). The American Association for Public Opinion Research. https://aapor.org/wp-content/uploads/2022/11/Standard-Definitions20169theditionfinal.pdf
AAPOR (2021). AAPOR Code of Professional Ethics and Practices. The American Association for Public Opinion Research. https://aapor.org/wp-content/uploads/2022/12/AAPOR-2020-Code_FINAL_APPROVED.pdf. Revised April 2021.
Bender, S., Jarmin, R. S., Kreuter, F., & Lane, J. (2020). Privacy and confidentiality. In I. Foster, R. Ghani, R. S. Jarmin, F. Kreuter, & J. Lane (Eds.), Big data and social science (2nd ed., Chap. 12). CRC Press. https://textbook.coleridgeinitiative.org.
Biemer, P. P., Chen, P., & Wang, K. (2013). Using level-of-effort paradata in non-response adjustments with application to field surveys. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(1), 147–168.
Blom, A. G., Herzing, J. M. E., Cornesse, C., Sakshaug, J. W., Krieger, U., & Bossert, D. (2017). Does the recruitment of offline households increase the sample representativeness of probability-based online panels? Evidence from the German Internet Panel. Social Science Computer Review, 35(4), 498–520.
Bradburn, N. M. (2016). Surveys as social interactions. Journal of Survey Statistics and Methodology, 4(1), 94–109.
Bradburn, N. M., Sudman, S., & Wansink, B. (2004). Asking questions: The definitive guide to questionnaire design. Jossey-Bass, Wiley.
Callegaro, M. (2013). Paradata in web surveys. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information. Wiley.
Callegaro, M., Manfreda, K. L., & Vehovar, V. (2015). Web survey methodology. Sage.
Charoenruk, N., & Olson, K. (2018). Do listeners perceive interviewers’ attributes from their voices and do perceptions differ by question type? Field Methods, 30(4), 312–328.
Chen, Z., Cernat, A., & Shlomo, N. (2022). Predicting web survey breakoffs using machine learning models. Social Science Computer Review, 41, 573–591.
Connors, E. C., Krupnikov, Y., & Ryan, J. B. (2019). How transparency affects survey responses. Public Opinion Quarterly, 83(S1), 185–209.
Conrad, F. G., Broome, J. S., Benkí, J. R., Kreuter, F., Groves, R. M., Vannette, D., & McClain, C. (2013). Interviewer speech and the success of survey invitations. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(1), 191–210.
Conrad, F. G., Keusch, F., & Schober, M. F. (2021). New data in social and behavioral research. Public Opinion Quarterly, 85(S1), 253–263. Introduction to Special Issue: New Data in Social and Behavioral Research.
Conrad, F. G., Schober, M. F., & Coiner, T. (2007). Bringing features of human dialogue to web surveys. Applied Cognitive Psychology, 21(2), 165–187.
Conrad, F. G., Tourangeau, R., Couper, M. P., & Zhang, C. (2017). Reducing speeding in web surveys by providing immediate feedback. Survey Research Methods, 11(1), 45–61.
Cornesse, C., & Schaurer, I. (2021). The long-term impact of different offline population inclusion strategies in probability-based online panels: Evidence from the German Internet Panel and the GESIS Panel. Social Science Computer Review, 39(4), 687–704.
Couper, M., & Kreuter, F. (2013). Using paradata to explore item level response times in surveys. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(1), 271–286.
Couper, M. P. (1998). Measuring survey quality in a CASIC environment. In Proceedings of the Survey Research Methods Section of the American Statistical Association, American Statistical Association (pp. 41–49). Joint Statistical Meetings of the American Statistical Association.
Couper, M. P. (2000). Usability evaluation of computer-assisted survey instruments. Social Science Computer Review, 18(4), 384–396.
Couper, M. P. (2017a). New developments in survey data collection. Annual Review of Sociology, 43, 121–145.
Couper, M. P. (2017b). Birth and diffusion of the concept of paradata. Advances in Social Research, 18. https://www.jasr.or.jp/english/JASR_Birth%20and%20Diffusion%20of%20the%20Concept%20of%20Paradata.pdf. English manuscript by Mick P. Couper, page numbers refer to pdf file.
Couper, M. P., Antoun, C., & Mavletova, A. (2017). Mobile web surveys. In P. P. Biemer, E. D. de Leeuw, S. Eckman, B. Edwards, F. Kreuter, L. E. Lyberg, N. C. Tucker, & B. T. West (Eds.), Total survey error in practice (pp. 133–154). Wiley.
Couper, M. P., & Peterson, G. J. (2017). Why do web surveys take longer on smartphones? Social Science Computer Review, 35(3), 357–377.
Couper, M. P., & Singer, E. (2013). Informed consent for web paradata use. Survey Research Methods, 7(1), 57–67.
De Boeck, P., & Jeon, M. (2019). An overview of models for response times and processes in cognitive tests. Frontiers in Psychology, 10, 1–11.
Durrant, G. B., Smith, P. W., & Maslovskaya, O. (2019). Investigating call record data using sequence analysis to inform adaptive survey designs. International Journal of Social Research Methodology, 22(1), 37–54.
Eckman, S. (2013). Paradata for coverage research. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 97–116). Wiley.
Eckman, S. (2016). Does the inclusion of non-internet households in a web panel reduce coverage bias? Social Science Computer Review, 34(1), 41–58.
Eckman, S. (2017). Interviewers’ expectations of response propensity can introduce nonresponse bias in survey data. Statistical Journal of the IAOS, 33(1), 231–234.
Edwards, B., Maitland, A., & Connor, S. (2017). Measurement error in survey operations management: Detection, quantification, visualization, and reduction. In P. P. Biemer, E. D. de Leeuw, S. Eckman, B. Edwards, F. Kreuter, L. E. Lyberg, N. C. Tucker, & B. T. West (Eds.), Total survey error in practice (pp. 253–277). Wiley.
Edwards, B., Sun, H., & Hubbard, R. (2020). Behavior change techniques for reducing interviewer contributions to total survey error. In K. Olson, J. D. Smyth, J. Dykema, A. L. Holbrook, F. Kreuter, & B. T. West (Eds.), Interviewer effects from a total survey error perspective (pp. 77–90). CRC Press.
Enqvist, L. (2024). Paradata as a tool for legal analysis—Utilising data on data related processes. In I. Huvila, L. Andersson, & O. Sköld (Eds.), Perspectives on paradata: Research and practice of documenting process knowledge. Springer.
Felderer, B., & Blom, A. G. (2022). Acceptance of the automated online collection of geographical information. Sociological Methods & Research, 51(2), 866–886.
Fernández-Fontelo, A., Kieslich, P. J., Henninger, F., Kreuter, F., & Greven, S. (2023). Predicting question difficulty in web surveys: A machine learning approach based on mouse movement features. Social Science Computer Review, 41(1), 141–162.
Ghani, R., & Schierholz, M. (2020). Machine learning. In I. Foster, R. Ghani, R. S. Jarmin, F. Kreuter, & J. Lane (Eds.), Big data and social science (Chap. 7, 2nd ed.). CRC Press. https://textbook.coleridgeinitiative.org
Groves, R. M. (2011). Three eras of survey research. Public Opinion Quarterly, 75(5), 861–871.
Groves, R. M., Fowler Jr., F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey methodology (2nd ed.). Wiley.
Groves, R. M., & Heeringa, S. G. (2006). Responsive design for household surveys: Tools for actively controlling survey errors and costs. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169(3), 439–457.
Henninger, F., Kieslich, P. J., Fernández-Fontelo, A., Greven, S., & Kreuter, F. (2022a). Privacy attitudes toward mouse-tracking paradata collection. Preprint, SocArXiv. https://osf.io/preprints/socarxiv/6weqx/. Version from March 15, 2022.
Henninger, F., Shevchenko, Y., Mertens, U. K., Kieslich, P. J., & Hilbig, B. E. (2022b). lab.js: A free, open, online study builder. Behavior Research Methods. Preprint at https://doi.org/10.5281/zenodo.597045
Hill, C. A., Biemer, P. P., Buskirk, T. D., Japec, L., Kirchner, A., Kolenikov, S., & Lyberg, L. E. (2021). Big data meets survey science: A collection of innovative methods. Wiley.
Höhne, J. K., Cornesse, C., Schlosser, S., Couper, M. P., & Blom, A. G. (2020a). Looking up answers to political knowledge questions in web surveys. Public Opinion Quarterly, 84(4), 986–999.
Höhne, J. K., Schlosser, S., Couper, M. P., & Blom, A. G. (2020b). Switching away: Exploring on-device media multitasking in web surveys. Computers in Human Behavior, 111, 106417.
Holbrook, A. L., Anand, S., Johnson, T. P., Cho, Y. I., Shavitt, S., Chávez, N., & Weiner, S. (2014). Response heaping in interviewer-administered surveys: Is it really a form of satisficing? Public Opinion Quarterly, 78(3), 591–633.
Jackson, M. T., McPhee, C. B., & Lavrakas, P. J. (2020). Using response propensity modeling to allocate noncontingent incentives in an address-based sample: Evidence from a national experiment. Journal of Survey Statistics and Methodology, 8(2), 385–411.
Jacobs, L., Loosveldt, G., & Beullens, K. (2020). Do interviewer assessments of respondents’ performance accurately reflect response behavior? Field Methods, 32(2), 193–212.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning (2nd ed.). Springer. https://www.statlearning.com. Accessed August 30, 2021.
Jans, M. E. (2010). Verbal paradata and survey error: Respondent speech, voice, and question-answering behavior can predict income item nonresponse. PhD thesis, University of Michigan, Ann Arbor, MI. https://isr.umich.edu/wp-content/uploads/2017/09/jans-dissertation.pdf
Japec, L., Kreuter, F., Berg, M., Biemer, P. P., Decker, P., Lampe, C., Lane, J., O’Neil, C., & Usher, A. (2015). Big data in survey research: AAPOR task force report. Public Opinion Quarterly, 79(4), 839–880.
Johnson, M. (2004). Timepieces: Components of survey question response latencies. Political Psychology, 25(5), 679–702.
Kennickell, A. B., Mulrow, E., & Scheuren, F. (2009). Paradata or process modeling for inference. Presented at the Modernization of Statistics Production Conference, Stockholm, Sweden, November 2–4, 2009.
Keusch, F., Struminskaya, B., Eckman, S., & Guyer, H. M. (2024). Data collection with wearables, apps, and sensors. CRC Press. In preparation.
Kieslich, P. J., Henninger, F., Wulff, D. U., Haslbeck, J. M. B., & Schulte-Mecklenbeck, M. (2019). Mouse-tracking: A practical guide to implementation and analysis. In M. Schulte-Mecklenbeck, A. Kühberger, & J. G. Johnson (Eds.), A handbook of process tracing methods (2nd ed., pp. 111–130). Routledge. https://doi.org/10.31234/osf.io/zuvqa
Kirchner, A., Olson, K., & Smyth, J. D. (2017). Do interviewer postsurvey evaluations of respondents’ engagement measure who respondents are or what they do? A behavior coding study. Public Opinion Quarterly, 81(4), 817–846.
Kreuter, F. (2013). Improving surveys with paradata: Introduction. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 1–9). Wiley.
Kreuter, F. (2018a). Getting the most out of paradata. In D. L. Vannette & J. A. Krosnick (Eds.), The palgrave handbook of survey research (pp. 193–198). Palgrave Macmillan/Springer.
Kreuter, F. (2018b). Paradata. In D. L. Vannette & J. A. Krosnick (Eds.), The palgrave handbook of survey research (pp. 529–535). Palgrave Macmillan/Springer.
Kreuter, F., & Casas-Cordero, C. (2010). Paradata. RatSWD Working Papers series Working Paper No. 136, German Data Forum (RatSWD). https://www.konsortswd.de/wp-content/uploads/RatSWD_WP_136.pdf. Accessed June 24, 2022.
Kreuter, F., Couper, M. P., & Lyberg, L. (2010a). The use of paradata to monitor and manage survey data collection. In Proceedings of the Survey Research Methods Section, American Statistical Association (pp. 282–296). Joint Statistical Meetings of the American Statistical Association.
Kreuter, F., & Jäckle, A. (2008). Are contact protocol data informative for potential nonresponse and nonresponse bias in panel studies? A case study from the Northern Ireland subset of the British Household Panel Survey. Paper presented at the Panel Survey Methods Workshop, University of Essex, Colchester, UK.
Kreuter, F., & Müller, G. (2015). A note on improving process efficiency in panel surveys with paradata. Field Methods, 27(1), 55–65.
Kreuter, F., & Olson, K. (2011). Multiple auxiliary variables in nonresponse adjustment. Sociological Methods & Research, 40(2), 311–332.
Kreuter, F., Olson, K., Wagner, J. R., Yan, T., Ezzati-Rice, T. M., Casas-Cordero, C., Lemay, M., Peytchev, A., Groves, R. M., & Raghunathan, T. E. (2010b). Using proxy measures and other correlates of survey outcomes to adjust for non-response: Examples from multiple surveys. Journal of the Royal Statistical Society: Series A (Statistics in Society), 173(2), 389–407.
Kühne, S. (2018). From strangers to acquaintances? Interviewer continuity and socially desirable responses in panel surveys. Survey Research Methods, 12(2), 121–146.
Kunz, T., Landesvatter, C., & Gummer, T. (2020a). Informed consent for paradata use in web surveys. International Journal of Market Research, 62(4), 396–408.
Kunz, T. C., Beuthner, C., Hadler, P., Roßmann, J., & Schaurer, I. (2020b). Informing about web paradata collection and use. GESIS Survey Guidelines, GESIS – Leibniz-Institute for the Social Sciences, Mannheim, Germany. https://doi.org/10.15465/gesis-sg_036
Kyllonen, P. C., & Zu, J. (2016). Use of response time for measuring cognitive ability. Journal of Intelligence, 4(4), 14.
Lugtig, P., & Toepoel, V. (2016). The use of PCs, smartphones, and tablets in a probability-based panel survey: Effects on survey measurement error. Social Science Computer Review, 34(1), 78–94.
Lyberg, L. (2011). The paradata concept in survey research. Presented at the CSDI Workshop, London, UK, March 24, 2011. https://csdiworkshop.org/wp-content/uploads/2020/03/Lybert2011CSDI.pdf. Accessed June 24, 2022.
Lynn, P. (2003). PEDAKSI: Methodology for collecting data about survey non-respondents. Quality & Quantity, 37(3), 239–261.
Maitland, A., & Presser, S. (2018). How do question evaluation methods compare in predicting problems observed in typical survey conditions? Journal of Survey Statistics and Methodology, 6(4), 465–490.
Matjašič, M., Vehovar, V., & Manfreda, K. L. (2018). Web survey paradata on response time outliers: A systematic literature review. Advances in Methodology and Statistics (Metodološki zvezki), 15(1), 23–41.
Matthijsse, S. M., De Leeuw, E. D., & Hox, J. J. (2015). Internet panels, professional respondents, and data quality. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 11(3), 81–88.
Mayerl, J. (2013). Response latency measurement in surveys. Detecting strong attitudes and response effects. Survey Methods: Insights From the Field, 27, 1–26.
Mayerl, J., Sellke, P., & Urban, D. (2005). Analyzing cognitive processes in CATI-Surveys with response latencies: An empirical evaluation of the consequences of using different baseline speed measures. Schriftenreihe des Instituts für Sozialwissenschaften der Universität Stuttgart -SISS- (Vol. 2/2005). Universität Stuttgart, Fak. 10 Wirtschafts- und Sozialwissenschaften, Institut für Sozialwissenschaften, Stuttgart, Germany. https://nbn-resolving.org/urn:nbn:de:0168-ssoar-117346
McClain, C. A., Couper, M. P., Hupp, A. L., Keusch, F., Peterson, G., Piskorowski, A. D., & West, B. T. (2019). A typology of web survey paradata for assessing total survey error. Social Science Computer Review, 37(2), 196–213.
McLauchlan, C., & Schonlau, M. (2016). Are final comments in web survey panels associated with next-wave attrition? Survey Research Methods, 10(3), 211–224.
Meitinger, K., Ackermann-Piek, D., Blohm, M., Edwards, B., Gummer, T., & Silber, H. (2020). Special Issue: Fieldwork Monitoring Strategies for Interviewer-Administered Surveys. Survey Methods: Insights from the Field. https://core.ac.uk/download/pdf/343333745.pdf, https://surveyinsights.org/?p=13732
Minderop, I., & Weiß, B. (2023). Now, later, or never? Using response-time patterns to predict panel attrition. International Journal of Social Research Methodology, 26(6), 693–706.
Mittereder, F. K. (2019). Predicting and Preventing Breakoff in Web Surveys. Dissertation, University of Michigan, Ann Arbor, MI. https://deepblue.lib.umich.edu/handle/2027.42/149963
Mittereder, F. K., & West, B. T. (2022). A Dynamic survival modeling approach to the prediction of web survey breakoff. Journal of Survey Statistics and Methodology, 10, 979–1004.
Mohadjer, L., & Edwards, B. (2018). Paradata and dashboards in PIAAC. Quality Assurance in Education, 26(2), 263–277.
Mohler, P. P., Pennell, B.-E., & Hubbard, F. (2012). Survey documentation: Toward professional knowledge management in sample surveys. In E. D. De Leeuw, J. Hox, & D. Dillman (Eds.), International handbook of survey methodology (pp. 403–420). Routledge.
National Academies of Sciences, Engineering, and Medicine (NAS) (2022). Transparency in statistical information for the national center for science and engineering statistics and all federal statistical agencies. The National Academies Press. https://doi.org/10.17226/26360
Nicolaas, G. (2011). Survey paradata: A review. Discussion Paper NCRM/017, ESRC National Centre for Research Methods Review paper. https://eprints.ncrm.ac.uk/id/eprint/1719
Olbrich, L., Beste, J., Sakshaug, J. W., & Schwanhäuser, S. (2022). The influence of audio recordings on interviewer behavior. Poster presented at the LMU Munich Department of Statistics Summer Retreat, July 8–9, 2022.
Olson, K. (2006). Survey participation, nonresponse bias, measurement error bias, and total bias. Public Opinion Quarterly, 70(5), 737–758.
Olson, K. (2013). Paradata for nonresponse adjustment. The Annals of the American Academy of Political and Social Science, 645(1), 142–170.
Olson, K., & Parkhurst, B. (2013). Collecting paradata for measurement error evaluations. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 43–72). Wiley.
Peytchev, A. (2012). Multiple imputation for unit nonresponse and measurement error. Public Opinion Quarterly, 76(2), 214–237.
Plewis, I., Calderwood, L., & Mostafa, T. (2017). Can interviewer observations of the interview predict future response? Methods, Data, Analyses, 11(1), 1–16.
Purdam, K., Sakshaug, J. W., Bourne, M., & Bayliss, D. (2020). Understanding ‘Don’t Know’ answers to survey questions – An international comparative analysis using interview paradata. Innovation: The European Journal of Social Science Research, 1–23. https://www.tandfonline.com/doi/abs/10.1080/13511610.2020.1752631
Revilla, M., & Ochoa, C. (2015). What are the links in a web survey among response time, quality, and auto-evaluation of the efforts done? Social Science Computer Review, 33(1), 97–114.
Roßmann, J., & Gummer, T. (2016). Using paradata to predict and correct for panel attrition. Social Science Computer Review, 34(3), 312–332.
Sakshaug, J. W. (2013). Using paradata to study response to within-survey requests. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 171–190). Wiley.
Sakshaug, J. W., & Kreuter, F. (2011). Using paradata and other auxiliary data to examine mode switch nonresponse in a “Recruit-and-Switch” telephone survey. Journal of Official Statistics, 27(2), 339–357.
Sakshaug, J. W., & Struminskaya, B. (2023). Call for Papers: Augmenting Surveys with Paradata, Administrative Data, and Contextual Data. A Special Issue of Public Opinion Quarterly. https://academic.oup.com/poq/pages/call-for-papers-augmenting-surveys
Sana, M., & Weinreb, A. A. (2008). Insiders, outsiders, and the editing of inconsistent survey data. Sociological Methods & Research, 36(4), 515–541.
Scheuren, F. (2001). Macro and micro paradata for survey assessment. In T. Black, K. Finegold, A. B. Garrett, A. Safir, F. Scheuren, K. Wang, & D. Wissoker (Eds.), 1999 NSAF collection of papers (pp. 2C-1–2C-15). Urban Institute. https://www.urban.org/sites/default/files/publication/61596/410138---NSAF-Collection-of-Papers.PDF
Schlosser, S., & Höhne, J. K. (2020). ECSP – Embedded Client Side Paradata. Note: the 2020 version is an expansion of the 2016 and 2018 versions. https://doi.org/10.5281/zenodo.3782592
Schouten, B., Bethlehem, J., Beullens, K., Kleven, Ø., Loosveldt, G., Luiten, A., Rutar, K., Shlomo, N., & Skinner, C. (2012). Evaluating, comparing, monitoring, and improving representativeness of survey response through r-indicators and partial R-indicators. International Statistical Review, 80(3), 382–399.
Schouten, B., Peytchev, A., & Wagner, J. R. (2017). Adaptive survey design. CRC Press.
Schwanhäuser, S., Sakshaug, J. W., & Kosyakova, Y. (2022). How to catch a falsifier: Comparison of statistical detection methods for interviewer falsification. Public Opinion Quarterly, 86(1), 51–81.
Schwarz, H., Revilla, M., & Struminskaya, B. (2022). Do previous survey experience and participating due to an incentive affect response quality? Evidence from the CRONOS panel. Journal of the Royal Statistical Society: Series A (Statistics in Society), 185, 1–23.
Sendelbah, A., Vehovar, V., Slavec, A., & Petrovčič, A. (2016). Investigating respondent multitasking in web surveys using paradata. Computers in Human Behavior, 55, 777–787.
Shlomo, N. (2018). Statistical disclosure limitation: New directions and challenges. Journal of Privacy and Confidentiality, 8(1). https://journalprivacyconfidentiality.org/index.php/jpc/article/view/684
Sinibaldi, J., Trappmann, M., & Kreuter, F. (2014). Which is the better investment for nonresponse adjustment: Purchasing commercial auxiliary data or collecting interviewer observations? Public Opinion Quarterly, 78(2), 440–473.
Smith, T. W. (2011). The report of the international workshop on using multi-level data from sample frames, auxiliary databases, paradata and related sources to detect and adjust for nonresponse bias in surveys. International Journal of Public Opinion Research, 23(3), 389–402.
Stern, M. J. (2008). The use of client-side paradata in analyzing the effects of visual layout on changing responses in web surveys. Field Methods, 20(4), 377–398.
Sturgis, P., Maslovskaya, O., Durrant, G., & Brunton-Smith, I. (2021). The interviewer contribution to variability in response times in face-to-face interview surveys. Journal of Survey Statistics and Methodology, 9(4), 701–721.
Sun, H., Conrad, F. G., & Kreuter, F. (2021). The relationship between interviewer-respondent rapport and data quality. Journal of Survey Statistics and Methodology, 9(3), 429–448.
Tourangeau, R. (2021). Science and survey management. Survey Methodology, 47(1), 3–29.
Tourangeau, R., & Yan, T. (2007). Sensitive questions in surveys. Psychological Bulletin, 133(5), 859–883.
Vardigan, M., Granda, P. A., & Hoelter, L. F. (2016). Documenting survey data across the life cycle. In C. Wolf, D. Joye, T. W. Smith, & Y.-c. Fu (Eds.), The SAGE handbook of survey methodology (pp. 443–459). SAGE.
Wagner, J. R. (2008). Adaptive Survey Design to Reduce Nonresponse Bias. Dissertation, University of Michigan, Ann Arbor, MI, 2008. https://deepblue.lib.umich.edu/handle/2027.42/60831
Wagner, J. R. (2013a). Adaptive contact strategies in telephone and face-to-face surveys. Survey Research Methods, 7(1), 45–55.
Wagner, J. R. (2013b). Using paradata-driven models to improve contact rates in telephone and face-to-face surveys. In F. Kreuter (Ed.), Improving surveys with paradata: analytic uses of process information (pp. 145–170). Wiley.
Wagner, J. R. (2019). Estimation of survey cost parameters using paradata. Survey Practice, 12(1).
Wagner, J. R., Olson, K., & Edgar, M. (2017). The utility of GPS data in assessing interviewer travel behavior and errors in level-of-effort paradata. Survey Research Methods, 11(3), 218–233.
Wagner, J. R., West, B. T., Kirgis, N., Lepkowski, J. M., Axinn, W. G., & Ndiaye, S. K. (2012). Use of paradata in a responsive design framework to manage a field data collection. Journal of Official Statistics, 28(4), 477–499.
West, B. T. (2011). Paradata in survey research. Survey Practice, 4(4), 1–8.
West, B. T. (2013a). The effects of error in paradata on weighting class adjustments: A simulation study. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 361–388). Wiley.
West, B. T. (2013b). An examination of the quality and utility of interviewer observations in the national survey of family growth. Journal of the Royal Statistical Society. Series A (Statistics in Society), 176(1), 211–225.
West, B. T. (2018a). Collecting interviewer observations to augment survey data. In D. L. Vannette & J. A. Krosnick (Eds.), The palgrave handbook of survey research (pp. 211–215). Palgrave Macmillan/Springer.
West, B. T. (2018b). Interviewer observations. In D. L. Vannette & J. A. Krosnick (Eds.), The palgrave handbook of survey research (pp. 537–548). Palgrave Macmillan/Springer.
West, B. T., & Blom, A. G. (2017). Explaining interviewer effects: A research synthesis. Journal of Survey Statistics and Methodology, 5(2), 175–211.
West, B. T., & Groves, R. M. (2013). A propensity-adjusted interviewer performance indicator. Public Opinion Quarterly, 77(1), 352–374.
West, B. T., & Li, D. (2019). Sources of variance in the accuracy of interviewer observations. Sociological Methods & Research, 48(3), 485–533.
West, B. T., & Sinibaldi, J. (2013). The quality of paradata: A literature review. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 339–359). Wiley.
West, B. T., & Trappmann, M. (2019). Effective strategies for recording interviewer observations: Evidence from the PASS study in Germany. Survey Methods: Insights from the Field.
West, B. T., Wagner, J. R., Coffey, S., & Elliott, M. R. (2023). Deriving priors for Bayesian prediction of daily response propensity in responsive survey design: Historical data analysis versus literature review. Journal of Survey Statistics and Methodology, 11(2), 367–392.
Wilkinson, L. R., Ferraro, K. F., & Kemp, B. R. (2017). Contextualization of survey data: What do we gain and does it matter? Research in Human Development, 14(3), 234–252.
Wulff, D. U., Kieslich, P. J., Henninger, F., Haslbeck, J., & Schulte-Mecklenbeck, M. (2021). Movement tracking of cognitive processes: A tutorial using mousetrap. Preprint. PsyArxiv. https://doi.org/10.31234/osf.io/v685r
Yan, T. (2021). Consequences of asking sensitive questions in surveys. Annual Review of Statistics and Its Application, 8, 109–127.
Yan, T., & Olson, K. (2013). Analyzing paradata to investigate measurement error. In F. Kreuter (Ed.), Improving surveys with paradata: Analytic uses of process information (pp. 73–96). Wiley.
Acknowledgements
The authors would like to thank the editors, two anonymous referees, the other authors, and members of the FK2RG research group for their comments. The authors would also like to thank the editors for an extraordinarily smooth, accommodating, and stimulating process.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
Schenk, P.O., Reuß, S. (2024). Paradata in Surveys. In: Huvila, I., Andersson, L., Sköld, O. (eds) Perspectives on Paradata. Knowledge Management and Organizational Learning, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-031-53946-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53945-9
Online ISBN: 978-3-031-53946-6