A Comparison of Statistical Methods for Studying Interactions of Chemical Mixtures

Properly assessing the effects of environmental chemical exposures on disease risk remains a challenging problem in environmental epidemiology. Various analytic approaches have been proposed, but few papers have compared the performance of different statistical methods on a single dataset. In this paper, we compare different regression-based approaches for estimating interactions between chemical mixture components using data from a case-control study of non-Hodgkin's lymphoma. An analytic challenge is the high percentage of exposures that are below the limit of detection (LOD). Using imputation for values below the LOD, we compare different Bayesian shrinkage prior approaches, including an approach that incorporates the hierarchical principle, where interactions are only included when main effects exist. Further, we develop an approach where main and interactive effects are represented by a series of distinct latent functions. We also fit Bayesian kernel machine regression to these data. All of these approaches show little evidence of an interaction among the chemical mixtures when measurements below the LOD are imputed. The imputation approach makes very strong assumptions about the relationship between exposure and disease risk for measurements below the LOD. As an alternative, we show the results of an analysis where we model the exposure relationship with two parameters per mixture component: one characterizing the effect of being below the LOD and the other a linear effect above the LOD. In this latter analysis, we identify numerous strong interactions that were not identified in the analyses with imputation. This case study demonstrates the importance of developing new approaches for mixtures when the proportion of exposure measurements below the LOD is high.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2024.

✉ Sungduk Kim, kims2@mail.nih.gov; Paul S. Albert, albertp@mail.nih.gov.

Introduction
In environmental epidemiology, interest often focuses on estimating the complex associations between environmental chemical mixtures and disease risk. Recently, various approaches have focused on characterizing higher-order interactions between mixture components and outcomes, including regression-based [1,2], kernel machine regression [3], and latent class modeling approaches [4-6]. Although these methods have been illustrated with actual chemical mixture data, few papers have compared the various approaches on an actual dataset. This article investigates the different modeling strategies using data from a case-control study examining the effects of chemical exposures on non-Hodgkin's Lymphoma (NHL).
Several analytic issues make these comparisons interesting. First, exposure effects can be nonlinear, making inferences about interactions more complex. Second, many of the chemical exposure measurements were below the lower limit of detection (LOD). Third, some of the chemicals were highly correlated. A number of articles have focused on developing summary score measures that relate mixtures to disease outcomes. These methods estimate a linear combination of the numerous mixture components and relate this combination to either a continuous or binary outcome [7,8]. The focus of this article is on understanding the complex interactions between the mixture components, and we therefore compare methodologies where this is the goal.
We present the NHL case-control study in Sect. 2. In all subsequent sections, we describe the various methods, followed by an analysis of these data using each approach. In Sect. 3, we review Bayesian kernel machine regression (BKMR). Section 4 presents the broad class of shrinkage prior regression-based approaches, including recent methodology that incorporates a hierarchical constraint for interaction estimation. We also examine a novel approach to account for the LOD using a multi-parameter per exposure formulation. Section 5 extends a recently developed latent class formulation [5] to the interaction setting. Finally, in Sect. 6 we present a discussion of the results along with future steps for methodological development.

NCI-SEER NHL Study
Studying the relationships between environmental and occupational exposure to chemicals and cancer risk remains an important area in cancer research (see the IARC website). The NCI-SEER NHL study [9] is a population-based case-control study that was designed to determine the associations between chemical exposures (including pesticides and insecticides) found in used vacuum cleaner bags and the risk of NHL. Chemicals often enter the household from indoor use or drift in from outdoors, and they may persist for months or years in carpets and cushioned furniture without being degraded by sunlight, rain, or extreme temperature. Hence, carpet dust sampling provides a more objective basis for exposure assessment, as it integrates chemical exposure over a long period, which is potentially more relevant to disease risk than recent or current exposure. In this study, samples were collected from the used vacuum cleaner bags of 672 cases and 508 control subjects in Detroit, Iowa, Los Angeles, and Seattle and were analyzed for chemicals [9]. The laboratory measurements contain missing data primarily due to concentrations being below the LOD. The median percentage of observations below the detection limit was 61% across chemicals, with a range of 3% to 93%. In the study analyses, multiple imputation was performed to "fill in" exposure measurements that were below the LOD. This imputation was done by assuming that the chemicals were log-normally distributed and that values below the LOD were in the tails of the distribution. Particularly for chemicals with a high percentage of values below their detection limits, the results may not be robust to misspecification of the parametric assumptions. Thus, we consider alternative, less model-based approaches to account for the LOD.
There were a few groups of chemicals in which members within a group were highly correlated with each other (correlation > 0.9). In these cases, we randomly chose one member of each highly correlated pair for the analysis. Exposure data were log-transformed, since measurements on the original scale were highly skewed. There were 26 chemical exposures measured, which are listed in the Appendix. After filtering out highly correlated chemicals, a total of 14 chemicals remained. Thus, the final dataset contained 14 chemical exposures on 1180 individuals (508 controls and 672 cases). Figure 1 shows the correlations among the 14 chemicals. We considered site, sex, education, and age as covariates [9] in all models for our data application.
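As an illustration of this preprocessing step, the following Python sketch log-transforms the exposures and randomly drops one member of each highly correlated pair. The 0.9 threshold matches the text; the function and variable names are our own.

```python
import numpy as np
import pandas as pd

def prepare_exposures(df, threshold=0.9, rng=None):
    """Log-transform exposure columns, then randomly drop one member of
    each highly correlated pair (|r| > threshold), as described in the text."""
    rng = np.random.default_rng(rng)
    logged = np.log(df)
    corr = logged.corr().abs()
    cols = list(logged.columns)
    drop = set()
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if cols[i] in drop or cols[j] in drop:
                continue  # one member of an earlier pair was already removed
            if corr.iloc[i, j] > threshold:
                # randomly keep one member of the highly correlated pair
                drop.add(cols[i] if rng.random() < 0.5 else cols[j])
    return logged.drop(columns=sorted(drop))
```

Applied to the 26 measured exposures, this kind of filter yields the 14 retained chemicals used in all subsequent models.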

Bayesian Kernel Machine Regression (BKMR)
A popular statistical method for analyzing chemical mixture data is Bayesian kernel machine regression. In this approach, Bobb et al. [3] modeled non-linear and non-additive relationships between exposure variables and outcome through a non-parametric kernel function. For a binary outcome Y_i, the kernel machine regression is implemented through a probit link,

P(Y_i = 1) = Φ{h(X_i1, X_i2, ..., X_ip) + U_i' α},  (1)

where Φ denotes the cumulative distribution function (CDF) of the standard normal distribution, h(·) is a flexible function of the p exposure variables X_i1, X_i2, ..., X_ip, and α defines the vector of regression coefficients for the covariates U_i. The function h(·) is characterized through a Gaussian kernel, where h = (h_1, h_2, ..., h_N)' is multivariate normal with mean 0 and correlation

cor(h_i, h_i') = exp{−τ Σ_{p=1}^{P} (X_ip − X_i'p)^2}  (2)

for all pairs of individuals i and i'. Equivalently, Eq. (1) can be written in terms of a latent variable Y_i* = h(X_i1, ..., X_ip) + U_i' α + ε_i with ε_i ~ N(0, 1); this formulation results in a probit link when we dichotomize the latent variable at zero, such that Y_i = 1 when Y_i* > 0 and 0 otherwise. We used the kmbayes function from the BKMR package to fit the model to the NHL data. Figure 2 shows the univariate exposure-response relationships for NHL with each chemical when the remaining chemicals are fixed at their median values. The plot suggests that none of the chemicals have a sizeable effect on cancer risk. Two-way interactions among all pairs of exposures can be characterized by estimating the conditional distribution of the effect of one exposure given quantiles of a second exposure, with the remaining chemicals fixed at their median values. Figure 3 shows the bivariate exposure-response relationships derived from the BKMR analysis for a subset of pairwise comparisons. The fact that, for each chemical, the conditional curves are parallel across quantiles of the other chemicals suggests no evidence of interaction effects. We saw similar parallelism for all 91 interaction terms, suggesting no interactions among the 14 chemicals (data not shown).
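The Gaussian kernel correlation in Eq. (2) is straightforward to compute directly. The sketch below, in plain NumPy with an assumed smoothing parameter tau, builds the N × N correlation matrix for h; it illustrates the kernel itself, not the full BKMR fitting algorithm.

```python
import numpy as np

def bkmr_kernel(X, tau):
    """Gaussian-kernel correlation matrix for the BKMR random effect h:
    cor(h_i, h_i') = exp(-tau * sum_p (X_ip - X_i'p)^2)."""
    # squared Euclidean distances between all pairs of exposure vectors
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-tau * sq)
```

Identical exposure profiles yield correlation 1, and the correlation decays toward 0 as profiles move apart, which is what makes h a smooth surface over the exposure space.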

Bayesian Shrinkage Methods
Shrinkage priors in Bayesian estimation provide a useful way to estimate higher-order interactions among mixture components. These approaches are analogous to the penalized likelihood approaches proposed in the frequentist context, and they have the advantage of incorporating the penalization/shrinkage into the inference of the model parameters [10].
In this section, we compare various Bayesian shrinkage methods for estimating the interactions among components of chemical mixtures. We consider the following logistic regression model with linear effects consisting of p chemical exposures or main effects and p(p − 1)/2 two-way interaction effects:

logit{P(Y_i = 1)} = U_i' α* + Σ_{j=1}^{p} β_j* X_ij + Σ_{j=1}^{p−1} Σ_{k=j+1}^{p} γ_jk* X_ij X_ik,  (3)

where Y = (Y_1, Y_2, ..., Y_N)' denotes the binary health response for N individuals and X_i = (X_i1, X_i2, ..., X_ip)' denotes the p-dimensional continuous vector of main effects. We also denote logit(a) = log{a/(1 − a)}, U_i = (U_i1, U_i2, ..., U_iq)' as the q-dimensional covariate vector including the intercept term, α* = (α_1, α_2, ..., α_q)' as the corresponding q-dimensional regression coefficient vector, β_j* as the main effect regression coefficient of the j-th chemical, and γ_jk* as the interaction effect regression coefficient of the j-th and k-th chemicals.
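The design matrix implied by Eq. (3), with p main effects plus all p(p − 1)/2 pairwise products, can be constructed as in the following short sketch (function names are ours):

```python
import numpy as np
from itertools import combinations

def interaction_design(X):
    """Augment a main-effects matrix X (N x p) with all p(p-1)/2
    pairwise products X_ij * X_ik, matching the model in Eq. (3)."""
    pairs = list(combinations(range(X.shape[1]), 2))
    inter = np.column_stack([X[:, j] * X[:, k] for j, k in pairs])
    return np.hstack([X, inter]), pairs
```

For the NHL data with p = 14, this produces 14 main-effect columns plus 91 interaction columns.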
Following a latent variable approach [11], we approximate Eq. (3) using a robit link [12]. Let ξ = (ξ_1, ξ_2, ..., ξ_N)' be an N-dimensional latent vector such that Y_i = 1 if ξ_i > 0 and 0 otherwise, where

ξ_i = U_i' α + Σ_{j=1}^{p} β_j X_ij + Σ_{j=1}^{p−1} Σ_{k=j+1}^{p} γ_jk X_ij X_ik + ε_i.  (4)

A robit link indexed by v results if ε_i follows a Student t-distribution with v degrees of freedom [13]. Here β = (β_1, β_2, ..., β_p)' and γ denotes the vector of all p(p − 1)/2 interaction coefficients, and the complete data likelihood follows from Eq. (4) together with the thresholding rule linking ξ_i to Y_i. The main and interaction effects can be estimated by choosing a vague prior such that β_j, γ_jk ~ N(0, 10^2); this is approximately a maximum likelihood approach. Incorporating global-local shrinkage parameters may be a better option, as they gather information from the data to determine the amount of shrinkage that needs to be applied. To that end, an independent shrinkage specification takes

β_j | η_j, a ~ N(0, 1/(a η_j)),   γ_jk | θ_jk, b ~ N(0, 1/(b θ_jk)).  (5)

The shrinkage priors in Eq. (5) do not imply the hierarchical principle [14,15], where interactions are only considered when the corresponding main effects are present. Recent work [1] imposed this hierarchical condition by incorporating the following prior distribution:

γ_jk | η_j, η_k, θ_jk, b ~ N(0, 1/(b η_j η_k θ_jk)).  (6)

The prior distribution in Eq. (6) follows the global-local prior specification of [16]. In this formulation, a local shrinkage parameter controls the degree of shrinkage for each individual coefficient and a global shrinkage parameter controls the overall shrinkage. Here, for the main effect regression coefficient β_j, we consider a predictor-specific local shrinkage parameter η_j that controls the deviation in the degree of shrinkage for each exposure variable and a global parameter a that controls the overall shrinkage of the main effects towards the origin. Similarly, for the interaction effect regression coefficient γ_jk, the local shrinkage parameter θ_jk controls the degree of shrinkage for each interaction term, while the global shrinkage parameter b controls the overall shrinkage. We define η = (η_1, η_2, ..., η_p)', the p-dimensional vector of local shrinkage parameters for the main effects, and θ = (θ_1, θ_2, ..., θ_p(p−1)/2)', the local shrinkage parameters for the interaction effects. As a prior choice for both the η_j's and θ_jk's, we consider the heavy-tailed Gamma(1, 1) distribution, with mean and variance 1, to avoid over-shrinking and to incorporate variability. Larger values of the η_j's and θ_jk's induce more shrinkage towards zero for the corresponding main and interaction effects, respectively, while smaller values induce less shrinkage. For the global shrinkage parameters a and b, we also consider a Gamma(1, 1) prior to place substantial mass near the origin. Finally, we considered a vague prior α ~ MVN(0, 10^2 I_q), where I_q is the q × q identity matrix.
The main objective of the shared shrinkage model is to incorporate a link between the main effects and the interaction effects. To that end, Kundu et al. [1] share information between the j-th and k-th main effects and the (j, k)-th interaction effect through the local parameters η_j and η_k. The prior variance of γ_jk is controlled by the term η_j η_k, such that γ_jk will shrink to zero if at least one of the corresponding main effects β_j or β_k is small, i.e., if the corresponding local shrinkage parameter η_j or η_k is large, or if the local shrinkage parameter θ_jk of the interaction term is itself large. Similarly, if the main effects are sizeable, i.e., their corresponding η_j and η_k are small, less shrinkage is induced for the corresponding interaction term γ_jk.
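The shrinkage behavior described above can be made concrete with a small sketch. Assuming the interaction prior takes the form γ_jk ~ N(0, 1/(b η_j η_k θ_jk)), written here as our own illustration consistent with the description of the shared shrinkage prior, the implied prior SD shrinks whenever any local or global parameter grows:

```python
import numpy as np

def interaction_prior_sd(eta_j, eta_k, theta_jk, b):
    """Hypothetical prior SD for gamma_jk under the shared shrinkage
    prior: large eta_j or eta_k (weak main effects) or a large theta_jk
    pulls the interaction coefficient toward zero."""
    return 1.0 / np.sqrt(b * eta_j * eta_k * theta_jk)
```

This is the hierarchical principle in miniature: a weak main effect (large η_j) alone is enough to collapse the prior on the corresponding interaction.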
Figure 4 shows a comparison of estimated interactions for the NHL study. For all interaction terms under the three models, the 95% HPD interval for γ_jk contains zero, suggesting no evidence of any two-way interaction among the components of the mixture. The interval lengths, ordered from largest to smallest, correspond to the vague, independent shrinkage, and shared shrinkage priors, respectively, demonstrating the efficiency advantages of incorporating a shrinkage prior and of exploiting the hierarchical assumption in parameter estimation.
As an additional sensitivity analysis, we also examined other shrinkage priors, including the ridge [17], Lasso [18], and horseshoe [19] priors. Under a linear exposure model (each exposure contributes a single linear term), estimation with these shrinkage priors can be done directly with the R package bayesreg. However, we emphasize that these methods do not incorporate any hierarchical structure. Figure 5 shows the comparison of interaction estimation using the different shrinkage priors. As we have 91 interaction terms and are comparing six different methods, we illustrate the results with the estimation of two interaction terms. Although the intervals obtained with all six methods contain zero, the shared shrinkage prior approach has the narrowest interval and is therefore the most efficient.
So far, we have described modeling a linear exposure-outcome relationship. In practice, exposure effects may be non-linear, requiring more than one regression parameter per exposure. Kundu et al. [1] extended their methodology to capture non-linear exposure-outcome relationships using the following logistic regression model:

logit{P(Y_i = 1)} = U_i' α + Σ_{j=1}^{p} g_j(X_ij) + Σ_{j=1}^{p−1} Σ_{k=j+1}^{p} f_jk(X_ij, X_ik).  (7)

We use a polynomial representation to model the non-linear exposure effect of each chemical. These polynomial effects are incorporated in the main and interaction effects by using Eq. (7) with functions g_j(X_ij) = X_ij' β_j and f_jk(X_ij, X_ik) = Z_jk' γ_jk, giving the logistic regression

logit{P(Y_i = 1)} = U_i' α + Σ_{j=1}^{p} X_ij' β_j + Σ_{j=1}^{p−1} Σ_{k=j+1}^{p} Z_jk' γ_jk,  (8)

where X_ij and Z_jk collect the polynomial terms for the j-th main effect and the (j, k)-th interaction, and β_j and γ_jk are the corresponding regression coefficient vectors. Most interesting for our application is to incorporate exposures that are subject to an LOD in a robust manner. We adopt a two-parameter per exposure model that was recently discussed for univariate exposure relationships by Chiou et al. [20] and Ortega-Villa et al. [21]. In this formulation, (i) the first component indicates whether the exposure for a single chemical is above the detection limit, and (ii) the second component gives the value of the exposure effect when it is above the detection limit. This parameterization allows a flexible modeling approach rather than treating values below the LOD as left censored. Hence, Kundu et al. [1] extend their work using Eq. (7) with

g_j(X_ij) = β_j1 1(X_ij ≥ LOD_j) + β_j2 X_ij 1(X_ij ≥ LOD_j),
f_jk(X_ij, X_ik) = γ_jk1 1(X_ij ≥ LOD_j) 1(X_ik ≥ LOD_k) + γ_jk2 1(X_ij ≥ LOD_j) X_ik 1(X_ik ≥ LOD_k) + γ_jk3 X_ij 1(X_ij ≥ LOD_j) 1(X_ik ≥ LOD_k) + γ_jk4 X_ij X_ik 1(X_ij ≥ LOD_j) 1(X_ik ≥ LOD_k),  (9)

where β_j1 defines the log odds of disease at the value of the detection limit relative to the log odds of disease below the detection limit, and β_j2 defines the log odds ratio of disease for a one-unit change in exposure above the detection limit. The interactive effects are measured by the parameter vector γ_jk. Here, γ_jk1 represents the interactive effect when both the j-th and k-th chemicals are above the detection limit, γ_jk4 represents the interactive effect of increasing X_ij and/or X_ik when both markers are above the detection limit, and γ_jk2 and γ_jk3 are cross-product interaction effects.
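The two-parameter coding of a single exposure subject to an LOD can be illustrated as follows; the function name and the convention of coding values below the LOD as zero are our own choices:

```python
import numpy as np

def lod_features(x, lod):
    """Two-parameter-per-exposure coding for a chemical subject to a
    limit of detection: an above-LOD indicator and the observed value
    when above the LOD (zero otherwise)."""
    above = (x >= lod).astype(float)
    return np.column_stack([above, np.where(above == 1.0, x, 0.0)])
```

The first column carries the "detectable at all" effect (β_j1) and the second the slope above the detection limit (β_j2); products of such columns across two chemicals generate the four interaction terms of Eq. (9).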
Using the two-parameter per exposure model, Fig. 6 shows that we found multiple interaction effects, some demonstrating positive synergy between chemicals and others showing a negative interaction. The fact that these results differ from those of the imputation approach suggests that imputing based on a parametric normal model may be problematic.

A Latent Functional Approach
To incorporate non-linear exposure-risk relationships in a binary regression setting, Kim et al. [5] proposed a latent functions approach, where the individual effects for each exposure in a risk model are written as the sum of unobserved functions. They showed that the relationship between chemical exposures and risk becomes more flexible as the number of latent functions increases, and that complex exposure relationships can be represented with only a few such functions. In this article, we extend the methodology to allow separate sets of latent classes for the main and interaction effects, respectively.
As in Sect. 4, let Y_i be the binary outcome for the i-th individual. Also, let U_i = (U_i1, ..., U_iq)' denote a q-dimensional vector of covariates for the i-th individual and α = (α_1, ..., α_q)' denote the vector of regression coefficients corresponding to the q covariates. Furthermore, let X_ij denote the chemical exposure for the j-th chemical on the i-th individual, j = 1, ..., p, and let Z_ik = X_ij1 X_ij2 denote the two-way interactions, j_2 = j_1 + 1, ..., p, j_1 = 1, ..., p − 1, k = 1, ..., K with K = p(p − 1)/2, and i = 1, ..., N. Similar to Kim et al. [5], we use a binary regression model with interactions based on a finite number of non-linear functions, using the latent variable approach of Albert and Chib [11], as follows:

ξ_i = U_i' α + Σ_{j=1}^{p} Σ_{l=1}^{L} 1(g_j = l) f_l(X_ij) + Σ_{k=1}^{K} Σ_{m=1}^{M} 1(h_k = m) s_m(Z_ik) + ε_i,  (10)

where f_l(X_ij) is a functional form of X_ij for the l-th latent class, g_j is a latent membership indicator with P(g_j = l) = ω_l, L is a fixed number of latent classes (1 ≤ L ≤ p), s_m(Z_ik) is a functional form of Z_ik for the m-th latent class, h_k is a latent membership indicator with P(h_k = m) = Ψ_m, M is a fixed number of latent classes (1 ≤ M ≤ K), and ε_i follows a t-distribution indexed by the degrees of freedom v = 7. The indicator function 1(A) is defined as 1(A) = 1 if A is true and 0 otherwise. In this paper, we assume a polynomial regression function of order c to capture the non-linear structure of f_l(X_ij) and s_m(Z_ik) in Eq. (10), namely f_l(X_ij) = β_l1 X_ij + ⋯ + β_lc X_ij^c ≡ X_ij*' β_l and s_m(Z_ik) = γ_m1 Z_ik + ⋯ + γ_mc Z_ik^c ≡ Z_ik*' γ_m, where β_l = (β_l1, ..., β_lc)' and γ_m = (γ_m1, ..., γ_mc)'. The latent variable ξ_i in Eq. (10) can then be rewritten as

ξ_i = U_i' α + Σ_{j=1}^{p} X_ij*' δ_j + Σ_{k=1}^{K} Z_ik*' λ_k + ε_i,  (11)

where δ_j = Σ_{l=1}^{L} 1(g_j = l) β_l and λ_k = Σ_{m=1}^{M} 1(h_k = m) γ_m are the regression coefficients for the j-th main effect and the k-th interaction term, respectively. The same prior distributions and MCMC algorithm as in Kim et al. (2023) are used in the analysis. To help obtain numerical stability in the MCMC sampling algorithm, we standardized all covariates by subtracting their sample means and dividing by their sample SDs. All main effect and interaction variables were standardized by dividing by their maximum values.
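To make the latent functional construction concrete, the following sketch evaluates the cubic contribution f_l(x) of one exposure given a sampled class membership; names and shapes are illustrative only:

```python
import numpy as np

def latent_effect(x, membership, beta):
    """Effect of one exposure under the latent functional model: the
    polynomial f_l(x) for the latent class l = membership, where beta
    holds one row of polynomial coefficients per latent class."""
    powers = np.column_stack([x, x**2, x**3])  # degree c = 3 basis
    return powers @ beta[membership]
```

Two exposures assigned to the same latent class share one functional shape, which is how the model remains flexible with only a handful of latent functions.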
We assumed a cubic polynomial regression function (c = 3) for f_l(X_ij) and s_m(Z_ik) in Eq. (10) to incorporate a flexible functional form. We considered models with various L and M to choose the number of latent classes needed to characterize the simultaneous effects of all chemicals on cancer risk. Table 1 shows the estimated posterior probabilities of ω_l and Ψ_m for the model with L = 5 and M = 5; the posterior probabilities of ω_l and Ψ_m for l > 3 and m > 3 were nearly zero, suggesting that many latent profiles are not needed.
Figure 7 shows the estimated log relative risks for the individual functional relationships, with the corresponding 95% HPD intervals, for the 14 main effects and 91 interaction terms under the model with L = 5 and M = 5. All 95% HPD intervals include the zero line, indicating that none of the main effects or interaction terms have a relationship with NHL.

Discussion
Recently, numerous statistical approaches have been proposed in the statistics literature for studying interactions among chemical mixture components. These approaches perform well under correct model specification. However, there have been few comparisons of these methodologies on actual study data. This paper compares several recently developed approaches using data from a case-control study of NHL that examined the effects of multiple pollutants on cancer risk.
A challenging analytic issue was the high proportion of measurements below the LOD among chemicals. The original analyses of the study [9] used a simple imputation method for values below the LOD. Using these imputations, we found that all the methods showed similar interaction effect estimates that were consistent with zero. Although we used only one realization from the imputation model for all analyses, we obtained nearly identical estimates using other realizations (data not shown).
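A minimal sketch of the kind of parametric imputation used in the original analyses follows, assuming a lognormal exposure distribution and drawing imputed values from the left tail below the LOD. The naive parameter estimates and the rejection sampler here are purely illustrative, not the study's actual multiple-imputation procedure.

```python
import numpy as np

def impute_below_lod(x, lod, rng=None):
    """Single imputation of values below the LOD under an assumed
    lognormal exposure distribution: fit mean/SD naively to the
    observed log values, then rejection-sample from the tail below
    the LOD for each censored observation."""
    rng = np.random.default_rng(rng)
    obs = np.log(x[x >= lod])
    mu, sigma = obs.mean(), obs.std(ddof=1)
    out = x.copy()
    for i in np.flatnonzero(x < lod):
        z = rng.normal(mu, sigma)
        while z >= np.log(lod):  # keep drawing until we land in the left tail
            z = rng.normal(mu, sigma)
        out[i] = np.exp(z)
    return out
```

When a large fraction of a chemical's measurements sit below the LOD, the imputed values are driven almost entirely by the assumed lognormal shape, which is exactly the robustness concern raised in the text.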
Recognizing that the imputation approaches make strong assumptions about the distribution below the LOD, we conducted an additional analysis where each chemical exposure was represented by two parameters: one for the effect of being above the LOD and a second for the slope above this limit. Our two-parameter per exposure model does not require those strong assumptions. These analyses focused on the shrinkage estimation, since this class of models can more easily be extended in a flexible way. Many interactions were identified with this formulation. In part, this can be explained if the imputation methods, which are difficult to validate, are inadequate (see Ortega-Villa et al. [21] for a simulation study with one exposure measurement). These results motivate future methodological work extending approaches such as BKMR and the latent functional approach to incorporate the LOD more flexibly.
The different methods make different assumptions about the linearity of the exposure effects. BKMR introduces flexible relationships through the specification of the kernel function. However, it is not entirely transparent what explicit linearity assumptions are made by specifying a particular kernel function. The latent functions approach explicitly assumes a polynomial form for the exposure relationships.
Each of the proposed methods used the scaled absolute exposure values in the analyses.
We also applied all the methods to percentiles of the exposure values rather than to the absolute measurements. We were able to fit all methods, with the exception of BKMR, to these transformed exposure values. We were unable to determine the reason for the computational failure of BKMR in this situation. However, for all other methods, we obtained inferences similar to those obtained with the absolute values (data not shown).
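The percentile transformation used in this sensitivity analysis can be sketched as a simple rank transform (ties are ignored in this illustration):

```python
import numpy as np

def to_percentiles(X):
    """Replace each exposure column by its empirical percentile
    (rank / N); the double argsort converts values to 0-based ranks."""
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    return ranks / X.shape[0]
```

This maps each column onto (0, 1], removing the influence of the heavily skewed raw scale on the fitted exposure-response curves.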
The methodology comparison focused on analyses from a case-control study. All the methods except BKMR have a direct relative-risk interpretation, since we incorporated a logit link function and NHL is a rare disease. The interpretation for BKMR is less clear, since this approach uses a probit link function to relate the mixture components to cancer risk. The methodology and comparisons apply naturally to cohort studies with binary outcomes. Extensions to survival and longitudinal outcomes are areas for future research.

Fig. 2. Univariate exposure-response estimation.
Fig. 3. Bivariate exposure-response estimation.
Fig. 6. Comparisons between randomly chosen slope vs. slope (γ_jk4) interaction effects.

Fig. 5. Comparison of interaction effects with Diazinon under different prior choices. The label D24 in the figure indicates the chemical 2,4-D.

Fig. 7. Plots of the estimated log relative risks (relative to no exposure) as a function of chemical exposure, with the posterior mean and 95% HPD intervals, under the model with L = 5 and M = 5: a main effects; b interaction effects.

Table 1
Posterior probability of ω_l and Ψ_m for the model with L = 5 and M = 5.

Stat Biosci. Author manuscript; available in PMC 2024 September 04.