Overview of collider bias

Collider bias occupies a paradoxical position among topics in causal inference. Owing to its importance, it has generated a considerable amount of scholarship and discussion (Pearl, 1995, 2009), yet relative to other forms of bias it remains comparatively obscure, at least in certain fields (Rohrer, 2018). A flurry of research in the wake of the SARS-CoV-2 pandemic has, at least briefly, helped reignite interest in collider bias (Griffith et al., 2020; Holmberg et al., 2022). Recent attention notwithstanding, discussion of collider bias remains rare in the social and behavioral sciences (Gratz, 2019; Rohrer, 2018), particularly among criminologists.

Several conditions can give rise to collider bias. The first arises when (1) an exposure (X) and an outcome (Y) both independently cause a third variable (C) and (2) that third variable is included as a control in a statistical model (Holmberg et al., 2022; Rohrer, 2018). Consider a scenario in which a researcher is interested in exploring a possible causal association between an exposure (X) and outcome (A). In reality, X exerts no causal impact on A; however, if the researcher conditions on outcome B, which is causally affected by both X and A, a correlation between X and A can be artificially induced (see Fig. 1; a brief simulation sketch follows the figure). The second scenario capable of introducing collider bias arises when there is non-random attrition from a sample, particularly when the exposure and outcome variables directly cause the attrition. The de facto consequence is that one can end up “conditioning” on a collider, regardless of whether a measured covariate is included in the analytical equation (Lee, 2012; Pearl, 2009).

Fig. 1

Directed acyclic graph (DAG) representing causal relationships. In the graph, outcome B represents the collider in the relationship between the exposure and outcome A. Though no causal relationship exists between the exposure and outcome A, conditioning upon outcome B may introduce a spurious relationship between the exposure and outcome A
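
To see the first mechanism in action, the brief Python simulation below (a minimal sketch with hypothetical, continuous variables; it is not drawn from any of the studies cited here) generates an exposure and an outcome A that are causally unrelated, lets both cause outcome B, and shows that selecting on B induces a spurious association between them.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# The exposure (x) and outcome A (a) are generated independently: no causal link.
x = rng.normal(size=n)
a = rng.normal(size=n)

# Outcome B (the collider) is caused by both the exposure and outcome A.
b = x + a + rng.normal(size=n)

# Unconditionally, the exposure and outcome A are essentially uncorrelated.
print("corr(x, a), full sample:", np.corrcoef(x, a)[0, 1])

# Conditioning on the collider, here by restricting the sample to high values of B
# (analogous to selective sampling), induces a spurious negative association.
high_b = b > np.median(b)
print("corr(x, a) | B high    :", np.corrcoef(x[high_b], a[high_b])[0, 1])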

Criminologists are widely familiar with other, related forms of bias, with confounder bias generally attracting the most attention (Elwert & Winship, 2014). Also known as omitted variable bias, confounder bias occurs when an exposure (X) and outcome (Y) are both caused by a third variable (Z) that is unaccounted for in the study design (see Fig. 1). Bias of this variety can prompt the conclusion that X causes Y when, in reality, the two variables correlate purely because they share the influence of Z (Elwert & Winship, 2014). Similarly, selection bias refers to a scenario in which selection into a sample is correlated with X or Y (Elwert & Winship, 2014). Collider bias is a specific variety of selection bias that occurs when some endogenous variable (C) is caused by both X and Y (Elwert & Winship, 2014; Greenland, 2003).
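
For contrast, the following sketch (again with simulated, hypothetical variables) illustrates confounder bias and its conventional remedy: when Z causes both X and Y and the true effect of X on Y is zero, omitting Z yields a spurious X coefficient, while adjusting for Z removes it.

import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# The confounder Z causes both the exposure X and the outcome Y; X has no effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

def ols_coefs(outcome, covariates):
    """Ordinary least squares via numpy; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(outcome))] + covariates)
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

# Omitting Z: the X coefficient is roughly 0.5 even though the true effect is zero.
print("Y ~ X     :", ols_coefs(y, [x])[1])

# Controlling for Z removes the confounding; the X coefficient falls to about zero.
print("Y ~ X + Z :", ols_coefs(y, [x, z])[1])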

Though not frequently discussed, criminologists should be well-acquainted with collider bias, as it is implicated in many of the field’s more pressing areas of inquiry. For example, one of the central discussions in criminology concerns the challenges of causally modeling social learning: disentangling interpersonal causal effects (each individual in a pair influencing the other, e.g., social learning) from homophilic “birds of a feather” effects (individuals who are connected because of their similarities, rather than similar because they connected) (Armstrong, 2003). When examining the associations between individual traits, deviant peer association, and delinquency, association with deviant peers can act as a collider: individuals’ attributes, traits, and characteristics cause the formation of a connection prior to the hypothesized effects of socialization over time, producing a false impression, or a magnification, of social learning effects (Shalizi & Thomas, 2011). Indeed, research has long demonstrated the tendency of social networks to assort themselves into homophilic relationships based on a range of individual characteristics (Angrist, 2014; Graif et al., 2017; Grund & Densley, 2015; McPherson et al., 2001; Smith & Papachristos, 2016). Because selection into networks effectively conditions on a collider, bias can result when modeling peer influence. This form of collider bias is particularly problematic in social network analyses but can be present in any model examining peer influence (or “social contagion”) and is therefore likely quite prevalent in criminological research.
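
A minimal simulation sketch of this homophily mechanism (hypothetical continuous measures, no real network data): a latent individual trait drives both tie formation and delinquency, there is no peer influence at all, and yet ego and alter delinquency are correlated among connected pairs.

import numpy as np

rng = np.random.default_rng(3)
n = 2_000
n_pairs = 50_000

# A latent propensity drives both friendship formation and delinquency;
# delinquency depends only on the individual's own trait (no peer influence).
u = rng.normal(size=n)
y = u + rng.normal(size=n)

# Draw candidate pairs and form ties with probability increasing in trait similarity.
i = rng.integers(0, n, size=n_pairs)
j = rng.integers(0, n, size=n_pairs)
keep = i != j
i, j = i[keep], j[keep]
similarity = -np.abs(u[i] - u[j])
tie = rng.random(i.size) < 1.0 / (1.0 + np.exp(-(2.0 * similarity + 1.0)))

# Among connected pairs, ego and alter delinquency are positively correlated,
# mimicking social contagion; across all candidate pairs the correlation is near zero.
print("corr(ego, alter) | tie:", np.corrcoef(y[i][tie], y[j][tie])[0, 1])
print("corr(ego, alter), all :", np.corrcoef(y[i], y[j])[0, 1])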

The labeling perspective’s efforts to differentiate secondary deviance and secondary sanctioning effects are another area of criminological research vulnerable to collider bias. Labeling research has become increasingly interested in disentangling secondary sanctioning and secondary deviance as causal outcomes of labeling events (see Liberman et al., 2014; Novak, 2022; Pratt et al., 2016). Labeling events are considered causes of both secondary deviance and secondary sanctioning. At the same time, secondary deviance and sanctioning may exert causal effects on one another and may therefore function as colliders in examinations of the perspective’s arguments. If measures of both secondary deviance and secondary sanctioning are included in models examining the effects of labeling, results may reflect a biased estimation of the correlation between labeling and secondary outcomes.

As a more tangible example, imagine we have a population of 100 individuals (see Fig. 2). Of these 100, 52 have previously been labeled as offenders and 48 have not, and 20 are about to commit an offense: 12 from the labeled group and 8 from the unlabeled group. The remaining 80 individuals, none of whom will commit an offense, are split evenly between the labeled and unlabeled groups (40 each). The probability of offending among the labeled group is therefore 0.23 [P(offend | label) = 12/52 = 0.23], while the probability of offending in the unlabeled group is 0.17 [P(offend | no label) = 8/48 = 0.17]. The risk of offending is greater in the labeled group, consistent with labeling theory. The offending population, for a variety of reasons, can be difficult to access, so for the sake of convenience we wait for a subset of this population to be arrested, which affords us access to a sample of individuals. This is where collider bias is introduced: both being labeled as an offender and committing an offense increase the likelihood of arrest. All 12 individuals who were previously labeled as offenders and then committed an offense were apprehended, placing them in our arrest sample. Half of the unlabeled individuals who committed an offense (4 of 8) were arrested and appear in the sample. Meanwhile, 5 of the 40 labeled non-offenders were arrested despite having committed no offense, and none of the unlabeled non-offenders were arrested. The resulting study sample of arrestees consists of 21 individuals; of the 17 labeled individuals in the sample, 12 committed an offense, so P(offend | label) = 12/17 = 0.71.

Fig. 2

Illustrative example of collider bias in a simple simulated observational study of labelling. Collider bias is introduced by the notional researchers’ decision to only sample arrested individuals, thereby controlling out the influence of arrest. These data are fictional and were generated for illustrative purposes only

As there are no unlabeled non-offenders in the sample, all 4 unlabeled individuals in the arrest sample committed an offense, so P(offend | no label) = 4/4 = 1. As a result, the risk ratio of offending conditional upon being labeled in this hypothetical arrestee population is 0.71/1.00 ≈ 0.7, suggesting that being labeled as an offender reduces the likelihood of offending. Meanwhile, in the true population, the risk ratio is (12/52)/(8/48) ≈ 1.38, suggesting labeling may causally increase offending behavior. All the more disconcerting is the fact that the same bias would occur if we introduced arrest as a statistical control when analyzing the full population (for similarly constructed examples see Cole et al., 2010). For the sake of clarity, we constructed a simplistic scenario to illustrate the impact of collider bias. More complex and rigorous simulation studies reveal the complexities associated with quantifying collider bias effects, with some estimates suggesting relatively little bias results from colliders (Liu et al., 2012) and others pointing to substantially more bias attributable to the introduction of colliders (Liu et al., 2012; Munafò et al., 2018; Whitcomb et al., 2009).
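
The arithmetic of this example can be verified directly from the counts given above; the short Python sketch below reproduces both risk ratios.

# Counts from the hypothetical population of 100 described above.
labeled_offenders, labeled_nonoffenders = 12, 40      # labeled group: 52 total
unlabeled_offenders, unlabeled_nonoffenders = 8, 40   # unlabeled group: 48 total

p_offend_label = labeled_offenders / (labeled_offenders + labeled_nonoffenders)          # 12/52 ~ 0.23
p_offend_nolabel = unlabeled_offenders / (unlabeled_offenders + unlabeled_nonoffenders)  # 8/48 ~ 0.17
print("population risk ratio   :", p_offend_label / p_offend_nolabel)                    # ~ 1.38

# Arrest sample: all 12 labeled offenders, 4 of the 8 unlabeled offenders,
# 5 of the 40 labeled non-offenders, and none of the unlabeled non-offenders.
p_offend_label_arrest = 12 / (12 + 5)        # ~ 0.71
p_offend_nolabel_arrest = 4 / (4 + 0)        # = 1.00
print("arrest-sample risk ratio:", p_offend_label_arrest / p_offend_nolabel_arrest)      # ~ 0.71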

Exacerbating the problem is the fact that controlling or conditioning on a variable is often what creates collider bias in the first place. As we have tried to illustrate in Fig. 1, collider bias and confounder bias are procedurally opposed: the solution to collider bias, removing the collider from the set of control variables, is the opposite of the solution to confounder bias, introducing the omitted control variable. Put differently, where confounders and mediators are statistically identical and can be understood similarly (MacKinnon et al., 2000), confounders and colliders are statistically antithetical (MacKinnon & Lamp, 2021).
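
This procedural opposition can be seen in a single simulated regression exercise (a sketch with hypothetical variables and a true exposure effect of zero): omitting the confounder biases the estimate, adding the confounder removes the bias, and adding the collider reintroduces it.

import numpy as np

rng = np.random.default_rng(5)
n = 100_000

# True structure: X has no effect on Y; Z is a confounder; C is a collider.
z = rng.normal(size=n)                 # confounder: causes both X and Y
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)             # no X term, so the true effect of X is zero
c = x + y + rng.normal(size=n)         # collider: caused by both X and Y

def coef_on_x(controls):
    design = np.column_stack([np.ones(n), x] + controls)
    return np.linalg.lstsq(design, y, rcond=None)[0][1]   # coefficient on X

print("Y ~ X         :", coef_on_x([]))        # biased upward by confounding
print("Y ~ X + Z     :", coef_on_x([z]))       # ~0 once the confounder is controlled
print("Y ~ X + Z + C :", coef_on_x([z, c]))    # biased again once the collider is added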

Collider bias in observational research

In observational studies, collider bias can occur inadvertently when researchers condition on a measure that occurred after an exposure, resulting in a spurious association between the exposure and outcome. For example, researchers interested in the relationship between parole supervision and re-offending might control for post-release outcomes such as enrollment in treatment programs, substance use, deviant peer association, educational attainment, or employment. If those controls are themselves causally related to both parole supervision and re-offending, their inclusion can induce a spurious or biased association between parole supervision and re-offending.

Collider bias may also occur in analyses using longitudinal data that, over time, have experienced some amount of sample attrition. Referring back to Fig. 1, an exposure and an outcome of interest (outcome A) may both causally affect attrition (outcome B). When attrition is caused by both the exposure and the outcome, it can introduce collider bias into estimates. For example, research on the relationship between caregiver separation and offending behavior over time may be susceptible to collider bias: caregiver separation may reflect household instability, which could cause study attrition, and individuals who engage in offending behavior may likewise be more likely to drop out. Even assuming no causal relationship between caregiver divorce or separation and offending behavior, conditioning on attrition by restricting samples to individuals with complete data across waves may introduce collider bias and induce an association between caregiver separation and offending.
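
A minimal sketch of this attrition mechanism, with simulated data and hypothetical attrition probabilities: caregiver separation and offending are generated independently, both raise the probability of dropping out, and the complete-case association is no longer zero.

import numpy as np

rng = np.random.default_rng(6)
n = 100_000

# Caregiver separation (x) and later offending (y) are simulated with no causal link.
x = rng.binomial(1, 0.3, size=n)
y = rng.binomial(1, 0.2, size=n)

# The probability of dropping out of the study rises with both separation and offending.
p_attrition = 0.1 + 0.4 * x + 0.4 * y
retained = rng.random(n) > p_attrition

def risk_difference(exposure, outcome):
    """Difference in offending rates between exposed and unexposed groups."""
    return outcome[exposure == 1].mean() - outcome[exposure == 0].mean()

print("full sample   :", risk_difference(x, y))                       # ~0: no true association
print("complete cases:", risk_difference(x[retained], y[retained]))   # non-zero: collider bias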

Collider bias in experimental research

While experiments have numerous desirable qualities, especially with respect to internal validity, they do not offer guaranteed protection from collider bias (Holmberg et al., 2022). Collider bias in experiments can occur via attrition, as well as by restricting samples to certain subgroups. To the extent that attrition from the study is non-random and attributable to both treatments (or their absence) and outcomes, collider bias occurs (Elwert & Winship, 2014). In a hypothetical experiment designed to estimate the effects of an afterschool program on delinquency, for example, restricting the sample to only individuals retained in the study may induce an association between program participation and delinquency. To understand why, recall that if both omission from the treatment program and delinquency cause attrition among participants, restricting the sample as described could lead to erroneous conclusions about program efficacy.

Collider bias can also be introduced into experimental designs by restricting samples to specific subgroups. For instance, researchers hoping to design a study examining the relationship between participation in substance use treatment and victimization may opt to restrict their experiment to only individuals not under criminal justice supervision. In doing so, they may inadvertently be conditioning on a collider, as both substance abuse treatment and victimization may cause offending behavior and/or criminal justice supervision. Conditioning on this collider could therefore introduce a spurious association between treatment and victimization, as treatment may cause supervision due to increased monitoring and surveillance, and victimization may cause supervision if victimization is experienced during the commission of an illegal behavior.

Approaches for dealing with collider bias

Collider bias is a vexing problem, and some of its sources are more easily addressed than others. When considering how to deal with collider bias introduced by variable selection, wider use of directed acyclic graphs (DAGs) may be of assistance (see Greenland et al., 1999; Lee, 2012; Rohrer, 2018; Schneider, 2020). DAGs are graphical illustrations that map causal relationships between variables (Pearl, 1995, 2009; see also Fig. 1). In the fields of criminal justice and criminology, researchers frequently make decisions about which variables to include in or exclude from models. DAGs force researchers, particularly those using secondary data, cross-sectional data, and/or multiple outcomes measured contemporaneously, to think through potential causal relationships, time-ordering, and the consequences of variable selection. Researchers should make decisions aimed at reducing bias where possible, understanding that the exclusion of some variables may lead to confounder bias while their inclusion may lead to collider bias, and that the researcher may need to determine the “lesser evil” in deciding how the model should be specified (Rohrer, 2018, p. 39).
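
As a small illustration of how a DAG can discipline variable selection, the sketch below uses the networkx library (one of several possible tools) to encode a simple graph loosely based on Fig. 1, extended with a hypothetical confounder, and lists descendants of the exposure, which are candidates for collider or mediator problems if controlled.

import networkx as nx

# Encode the researcher's assumed causal structure as a directed acyclic graph.
dag = nx.DiGraph([
    ("exposure", "outcome_B"),        # exposure causes the collider
    ("outcome_A", "outcome_B"),       # outcome A also causes the collider
    ("confounder_Z", "exposure"),     # hypothetical common cause
    ("confounder_Z", "outcome_A"),
])
assert nx.is_directed_acyclic_graph(dag)

# Descendants of the exposure are post-treatment variables; controlling for them risks
# collider (or mediator) bias, whereas common causes are candidates for adjustment.
print("avoid controlling for :", nx.descendants(dag, "exposure"))
print("consider adjusting for:", set(dag.predecessors("exposure")))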

Certain methodologies designed to produce causal estimates with observational data can also prove useful for managing collider bias. Because these approaches address many threats from confounding variables, they reduce the risk of inadvertently introducing colliders into models. One such approach involves a variant of sibling analyses, specifically models testing whether twin discordance on some variable tends to produce discordance on another (Frisell et al., 2012). In their most conservative form, analyzing only monozygotic (MZ) twins, these models account for all familial confounders, both genetic and environmental (McGue et al., 2010). This happens absent measured control variables as a result of the genetic and shared environmental overlap between MZ twins (Barnes et al., 2014; McAdams et al., 2021; McGue et al., 2010). Because the composition of the sample inherently reduces the need for control variables, the possibility that colliders are unintentionally introduced into models is decreased. Causal methodologies such as instrumental variable models, counterfactual approaches, difference-in-differences models, and regression discontinuity designs may offer similar benefits, in that they allow researchers to account for observed and/or unobserved confounders without risking the introduction of colliders into models (Akimova et al., 2021; Angrist, 2014).
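
The logic of the discordant-twin design can be sketched with simulated data (the parameter values below are hypothetical): a shared familial factor confounds the individual-level association, while regressing within-pair differences in the outcome on within-pair differences in the exposure recovers the true effect without any measured controls.

import numpy as np

rng = np.random.default_rng(8)
pairs = 50_000
true_effect = 0.5

# Each MZ pair shares a familial factor (genes plus shared environment) that raises
# both the exposure and the outcome; each twin also has unique influences.
family = rng.normal(size=pairs)
exposure = family[:, None] + rng.normal(size=(pairs, 2))
outcome = true_effect * exposure + family[:, None] + rng.normal(size=(pairs, 2))

# The naive individual-level estimate is inflated by the shared familial confounder.
naive_slope = np.polyfit(exposure.ravel(), outcome.ravel(), 1)[0]

# Within-pair differences cancel everything the twins share, familial confounding included.
diff_exposure = exposure[:, 0] - exposure[:, 1]
diff_outcome = outcome[:, 0] - outcome[:, 1]
within_slope = np.polyfit(diff_exposure, diff_outcome, 1)[0]

print("naive estimate      :", naive_slope)    # well above the true effect of 0.5
print("within-pair estimate:", within_slope)   # approximately 0.5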

Addressing collider bias in the causal modeling of social network data and learning-based hypotheses presents additional challenges (Lyons, 2011; Shalizi & Thomas, 2011). While some traditional approaches for addressing bias may assist with collider bias [i.e., counterfactual (Elwert & Christakis, 2008) or instrumental variable approaches (O’Malley et al., 2014)], several statistical models unique to network analysis have also been introduced as possible solutions. Shalizi and Thomas (2011) propose a model of tie formation and dissolution, aiming to identify instances where behaviors spread in response to tie formation and cease in response to tie dissolution, thereby eliminating instances where an outcome is maintained by similar characteristics rather than social contagion (Krivitsky & Handcock, 2014). Ver Steeg and Galstyan (2011) borrow from quantum physics’ Bell inequalities to both estimate and rule out the influence of latent homophily bias, and VanderWeele (2011) tests for the presence of homophily bias in longitudinal data by calculating the sensitivity of the outcome to a lagged measure of the outcome. More recent advances in causal network analysis include the use of causal structural equation modeling. Causal SEM, in this case, explicitly models homophily via a targeted maximum likelihood estimation (TMLE) approach for both single time point interventions and observational data (van der Laan, 2014; Ogburn et al., 2022), as well as the application of g-computation to network data to allow for the modeling of direct and “spillover” effects within a network (Tchetgen Tchetgen et al., 2021; see An et al., 2022 for additional information).

Collider bias associated with sample selection and/or attrition is more challenging to address. As evidenced by recent research on COVID-19 (Griffith et al., 2020; Holmberg et al., 2022), generating study samples based on possible colliders can produce biased estimates; researchers should therefore avoid conditioning study eligibility in experimental designs on potential colliders where possible. Similarly, researchers conducting experimental studies should work to minimize attrition where possible (Elwert & Winship, 2014). If attrition is completely random, its costs may be limited to reductions in statistical power. If, as is more likely, attrition is not completely random, differential attrition can function as a collider and introduce bias into the resulting estimates. Though minimizing attrition is often easier said than done, doing so can help to reduce the influence of collider bias in experimental studies. Where appropriate, researchers might also consider imputation strategies to address attrition-related collider bias, provided appropriate imputation techniques and a sufficient number of imputed datasets are used (Munafò et al., 2018; Young & Johnson, 2010). Though dependent variables have not traditionally been imputed, research suggests that, given appropriate techniques, imputing dependent variables is feasible and may help to reduce attrition-related collider bias (Munafò et al., 2018; Young & Johnson, 2010).
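
A minimal mechanical sketch of imputing an outcome lost to attrition is given below, using scikit-learn's IterativeImputer as one of many possible tools; the variable names and attrition model are hypothetical, and imputation helps here only because attrition depends on variables that are observed and included in the imputation model.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(9)
n = 20_000

# x is the exposure of interest; w is an auxiliary covariate that also drives the outcome y.
x = rng.normal(size=n)
w = rng.normal(size=n)
y = 0.4 * x + 0.8 * w + rng.normal(size=n)     # the true effect of x on y is 0.4

# Attrition depends on both x and w, so complete cases implicitly condition on a collider.
missing = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * (x + w)))
print("complete-case slope:", np.polyfit(x[~missing], y[~missing], 1)[0])   # biased below 0.4

# Several stochastic imputations of the missing outcome (drawing on x and w), with the
# simple y ~ x analysis repeated in each completed dataset and the estimates pooled.
data = np.column_stack([x, w, np.where(missing, np.nan, y)])
slopes = []
for m in range(10):
    completed = IterativeImputer(sample_posterior=True, random_state=m).fit_transform(data)
    slopes.append(np.polyfit(completed[:, 0], completed[:, 2], 1)[0])
print("pooled MI slope    :", np.mean(slopes))   # closer to the true 0.4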

Absent the ability to strongly curtail or wholly eliminate collider bias through research design or model selection and specification, researchers can employ one of several strategies to quantify collider-related bias in their estimates. If collider bias is deemed a threat, researchers should estimate multiple model specifications, iteratively testing the sensitivity of estimates to the inclusion and exclusion of theorized colliders. After testing sensitivity, researchers can consider quantifying the impact of collider bias in their models. Mitchell et al. (2022) provide a comprehensive review of newer approaches intended to help quantify bias associated with colliders, and while quantifying bias is not the same as reducing or preventing it, insight into how much bias is present in a model may be valuable for understanding its impact on results.
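
One way to operationalize such a sensitivity check is sketched below with simulated data and hypothetical variable names: the same outcome model is estimated with and without a theorized collider, and the coefficient of interest is compared across specifications.

import numpy as np

rng = np.random.default_rng(10)
n = 50_000

# Simulated data in which c is a theorized collider of the exposure and the outcome.
exposure = rng.normal(size=n)
outcome = 0.3 * exposure + rng.normal(size=n)     # true exposure effect: 0.3
covariate = rng.normal(size=n)                    # an innocuous control variable
c = exposure + outcome + rng.normal(size=n)       # suspected collider

def exposure_coefficient(controls):
    design = np.column_stack([np.ones(n), exposure] + controls)
    return np.linalg.lstsq(design, outcome, rcond=None)[0][1]

# Compare the exposure coefficient across specifications that include or exclude
# the theorized collider; a sharp divergence signals potential collider bias.
specifications = {
    "exposure only":            [],
    "+ covariate":              [covariate],
    "+ covariate + collider c": [covariate, c],
}
for name, controls in specifications.items():
    print(f"{name:26s} {exposure_coefficient(controls): .3f}")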

That said, the importance of model specification cannot be overemphasized. The inclusion of variables in statistical models always carries some consequence, and the magnitude of those consequences can be amplified when relying on cross-sectional data or multiple outcomes measured contemporaneously. Such conditions are primed for complex problems in which variables may act as confounders if omitted or as colliders if included (Frisell et al., 2012; McGue et al., 2010; Saunders et al., 2019). Navigating these scenarios requires careful consideration (Frisell et al., 2012; Saunders et al., 2019; Sjölander et al., 2012), and researchers should recognize that the inclusion of variables in models, as well as sampling procedures, warrants as much attention as is generally dedicated to the identification of potential confounders. Wider reliance on DAGs, in conjunction with the other strategies covered here, offers the promise of steady progress toward a more robust science of crime and its attendant topics (Greenland et al., 1999).