1 Introduction

In environmental epidemiology, interest often focuses on estimating the complex associations between environmental chemical mixtures and disease risk. Recently, various approaches have been proposed for characterizing higher-order interactions between mixture components and outcomes, including regression-based [1, 2], kernel machine regression [3], and latent class modeling approaches [4,5,6]. Although these methods have been illustrated with actual chemical mixture data, few papers have compared the various approaches on an actual dataset. This article investigates the different modeling strategies using case–control study data examining the effects of chemical exposures on non-Hodgkin’s Lymphoma (NHL).

Several analytic issues make these comparisons interesting. First, exposure–response relationships can be non-linear, making inferences about interactions more complex. Second, many of the chemical exposure measurements were below the lower limit of detection (LOD). Third, some of the chemicals were highly correlated. A number of articles have focused on developing summary score measures that relate mixtures to disease outcomes; these methods estimate a linear combination of the numerous mixture components and relate this combination to either a continuous or binary outcome [7, 8]. The focus of this article is instead on understanding the complex interactions between the mixture components, and we therefore compare methodologies where this is the goal.

We present the NHL case–control study in Sect. 2. In all subsequent sections, we describe each method and then analyze these data with it. In Sect. 3, we review Bayesian kernel machine regression (BKMR). Section 4 presents the broad class of shrinkage prior regression-based approaches, including recent methodology that incorporates a hierarchical constraint for interaction estimation; we also examine a novel approach to account for the LOD using a multi-parameter per exposure formulation. Section 5 extends a recently developed latent class formulation [5] to the interaction setting. Finally, in Sect. 6 we present a discussion of the results along with next steps for methodological development.

2 NCI-SEER NHL Study

Studying the relationships between environmental and occupational exposure to chemicals and cancer risk remains an important area in cancer research (see the IARC website). The NCI-SEER NHL study [9] is a population-based case–control study that was designed to determine the associations between chemical exposures (including pesticides and insecticides) found in used vacuum cleaner bags and the risk of NHL. Chemicals often enter the household from indoor use or drift in from outdoors, and they may persist for months or years in carpets and cushioned furniture without being degraded by sunlight, rain, or extreme temperature. Hence, carpet dust sampling provides a more objective basis for exposure assessment: it integrates chemical exposure over a long period, which is potentially more relevant to disease risk than recent or current exposure. In this study, samples were collected from the used vacuum cleaner bags of 672 cases and 508 control subjects in Detroit, Iowa, Los Angeles, and Seattle and were analyzed for chemicals [9]. The laboratory measurements contain missing data primarily because concentrations fell below the LOD; the median percentage of observations below the detection limit was 61% across chemicals, with a range of 3% to 93%. In the original study analyses, multiple imputation was performed to “fill in” exposure measurements that were below the LOD. This imputation assumed that chemicals were log-normally distributed and that values below the LOD were in the tails of the distribution. Particularly for chemicals with a high percentage of values below their detection limits, results may not be robust to misspecification of these parametric assumptions. Thus, we consider alternative, less model-based approaches to account for the LOD.

There were a few groups of chemicals in which members were highly correlated with each other (correlation > 0.9). In these cases, we randomly chose one member from each highly correlated pair for the analysis; a sketch of this step is given below. Exposure data were log-transformed since measurements on the original scale were highly skewed. There were 26 chemical exposures measured, which are listed in the Appendix; after filtering out highly correlated chemicals, 14 chemicals remained. Thus, the final dataset contained 14 chemical exposures on 1180 individuals (508 controls and 672 cases). Figure 1 shows the correlations between the 14 chemicals. We considered site, sex, education, and age as covariates [9] in all models for our data application.
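
A minimal base R sketch of the filtering step, assuming the raw concentrations sit in a matrix rawX (a hypothetical name; the study's actual preprocessing code is not shown in the paper):

```r
set.seed(1)
logX <- log(rawX)  # rawX: matrix of 26 chemical concentrations (hypothetical)
cm <- abs(cor(logX, use = "pairwise.complete.obs"))
diag(cm) <- 0

# repeatedly find the most correlated remaining pair and randomly drop one member
while (max(cm) > 0.9) {
  pair <- which(cm == max(cm), arr.ind = TRUE)[1, ]
  pick <- sample(colnames(cm)[pair], 1)
  cm <- cm[rownames(cm) != pick, colnames(cm) != pick, drop = FALSE]
}

X <- logX[, colnames(cm)]  # the 14 retained log-exposures
```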

Fig. 1 Correlation plot for chemical exposures

3 Bayesian Kernel Machine Regression (BKMR)

A popular statistical method for analyzing chemical mixture data is Bayesian kernel machine regression. In this approach, Bobb et al. [3] modeled non-linear and non-additive relationships between exposure variables and the outcome through a non-parametric kernel function. For a binary outcome \(Y_i\), the kernel machine regression is implemented through a probit link

$$\begin{aligned} \Phi ^{-1}(P(Y_i=1))=h(X_{i1}, X_{i2}, \ldots , X_{ip})+\varvec{ U_i' \alpha }, \end{aligned}$$
(1)

where \(\Phi\) denotes the cumulative distribution function (CDF) of the standard normal distribution, \(h(\cdot )\) is a flexible function of the p exposure variables \(X_{i1}, X_{i2}, \ldots , X_{ip}\), and \(\varvec{\alpha }\) defines the vector of regression coefficients for covariates \(\varvec{U_i}\). The function \(h(\cdot )\) is characterized by a Gaussian kernel: \(h=(h_1,h_2,\ldots ,h_N)'\) is multivariate normal with mean \(\varvec{0}\) and correlation given by \(\text {cor}(h_i,h_{i'})=\text {exp}\left( -\tau \sum _{j=1}^p(X_{ij}-X_{i'j})^2\right)\) for all pairs of individuals i and \(i'\). Equivalently, the probit model in Eq. (1) can be written in terms of a latent variable \(Y_i^*\),

$$\begin{aligned} Y^*_i=h(X_{i1}, X_{i2}, \ldots , X_{ip})+\varvec{ U_i' \alpha } +\epsilon _i, \quad i=1,2,\ldots ,N, \end{aligned}$$
(2)

where \(\epsilon _i \sim N(0,1)\). This formulation recovers the probit link when the latent variable is dichotomized at zero, such that \(Y_i=1\) when \(Y_i^* > 0\) and 0 otherwise. We used the kmbayes function from the R package bkmr to fit the model to the NHL data; a sketch of the call appears below. Figure 2 shows the univariate exposure–response relationship for NHL with each chemical when the remaining chemicals are fixed at their median values. The plot suggests that none of the chemicals has a sizeable effect on cancer risk.
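
Such a fit could be obtained roughly as follows, assuming the exposure matrix X, covariate matrix U, and 0/1 outcome y from Sect. 2 are in memory (object names are ours; the iteration count is illustrative):

```r
library(bkmr)
library(ggplot2)

set.seed(123)
# note: in bkmr's notation, Z holds the exposures and X the covariates
fit <- kmbayes(y = y, Z = X, X = U, family = "binomial", iter = 10000)

# univariate exposure-response curves, other chemicals fixed at their medians (Fig. 2)
pred.uni <- PredictorResponseUnivar(fit, q.fixed = 0.5)
ggplot(pred.uni, aes(z, est, ymin = est - 1.96 * se, ymax = est + 1.96 * se)) +
  geom_smooth(stat = "identity") +
  facet_wrap(~ variable)
```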

Two-way interactions among all pairs of exposures can be characterized by estimating the exposure–response relationship of one exposure at quantiles of a second exposure, with the remaining chemicals fixed at their median values. Figure 3 shows the bivariate exposure–response relationships derived from the BKMR analysis for a subset of pairwise comparisons. For each chemical, the conditional curves are parallel across quantiles of the other chemical, suggesting no evidence of interaction effects. We saw similar parallelism for all 91 interaction terms, suggesting no interactions among the 14 chemicals (data not shown).
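
These conditional curves could be computed with the bivariate helpers in bkmr, continuing the hypothetical fit above (the displayed quantiles match Fig. 3 only by assumption):

```r
# exposure-response for one chemical at quantiles of another, rest at medians (Fig. 3)
# z.pairs can restrict the computation to a subset of the 91 pairs
pred.biv <- PredictorResponseBivar(fit)
pred.lev <- PredictorResponseBivarLevels(pred.biv, Z = X, qs = c(0.25, 0.5, 0.75))

ggplot(pred.lev, aes(z1, est, colour = quantile)) +
  geom_line() +
  facet_grid(variable2 ~ variable1)  # parallel curves indicate no interaction
```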

Fig. 2 Univariate exposure–response estimation

Fig. 3 Bivariate exposure–response estimation

4 Bayesian Shrinkage Methods

Shrinkage priors in Bayesian estimation provide a useful way to estimate higher-order interactions among mixture components. These approaches are analogous to penalized likelihood approaches proposed in the frequentist context and have the advantage that the penalization/shrinkage is incorporated directly into the inference on the model parameters [10].

In this section, we compare various Bayesian shrinkage methods for estimating the interactions among components of chemical mixtures. We consider the following logistic regression model with linear effects consisting of p chemical exposures (main effects) and \(p(p-1)/2\) two-way interaction effects:

$$\begin{aligned} \text {logit}\, P\left( Y_i=1|\varvec{X_i, U_i}\right) =\varvec{U_i}'\varvec{\alpha ^*}+\sum _{j=1}^p X_{ij}\beta ^*_j +\sum _{j=1}^{p-1} \sum _{k=j+1}^{p} X_{ij} X_{ik} \gamma ^*_{jk} , \quad i=1,2,\ldots ,N, \end{aligned}$$
(3)

where \(Y=\left( Y_1, Y_2, \ldots , Y_N\right) '\) denotes the binary health responses for N individuals and \(\varvec{X_i}=\left( X_{i1},X_{i2},\ldots ,X_{ip}\right) '\) denotes the p-dimensional vector of continuous exposures (main effects). We also define \(\text {logit}\, a=\text {log}\frac{a}{1-a}\), \(\varvec{U_i}=\left( U_{i1},U_{i2},\ldots ,U_{iq}\right) '\) as the q-dimensional covariate vector including the intercept term, \(\varvec{\alpha ^*}=\left( \alpha _{1},\alpha _{2},\ldots ,\alpha _{q}\right) '\) as the corresponding q-dimensional regression coefficient vector, \(\beta ^*_j\) as the main effect regression coefficient of the \(j^{th}\) chemical, and \(\gamma ^*_{jk}\) as the interaction effect regression coefficient of the \(j^{th}\) and \(k^{th}\) chemicals.

Following a latent variable approach [11], we approximate Eq. (3) using a robit link [12]. Let \(\varvec{\xi }=\left( \xi _1, \xi _2, \ldots , \xi _N\right) '\) be an N-dimensional latent vector such that \(Y_i = 1\) if \(\xi _i > 0\) and 0 otherwise, where \(\xi _i = \varvec{U_i}'\varvec{\alpha ^*}+\sum _{j=1}^p X_{ij}\beta _j^* +\sum _{j=1}^{p-1} \sum _{k=j+1}^{p} X_{ij} X_{ik} \gamma _{jk}^* +\epsilon _i\). The robit link function, indexed by v, results if \(\epsilon _i\) follows a Student t-distribution with v degrees of freedom [13], i.e., \(P\left( Y_i=1|\varvec{\alpha ^*,\beta ^*,\gamma ^*}\right) =F_{t_v} \left( \varvec{U_i}'\varvec{\alpha ^*}+\sum _{j=1}^p X_{ij}\beta _j^* +\sum _{j=1}^{p-1} \sum _{k=j+1}^{p} X_{ij} X_{ik} \gamma _{jk}^*\right)\), where \(\beta ^*=\left( \beta _{1}^*,\beta _{2}^*,\ldots ,\beta _{p}^*\right) '\) and \(\gamma ^*=\left( \gamma _{12}^*,\gamma _{13}^*,\ldots ,\gamma _{(p-1)p}^*\right) '\). As \(v \rightarrow \infty\), the robit(v) model converges to the probit regression model. Liu [12] suggested that the robit link with \(v=7\) degrees of freedom closely approximates the logit link with \(\alpha _j = \alpha ^*_j/1.5484\), \(\beta _j = \beta ^*_j/1.5484\), and \(\gamma _{jk} = \gamma _{jk}^*/1.5484\).

To formulate the likelihood, we use the fact that the t-distribution can be represented as a scale mixture of normal distributions by introducing a mixing variable \(\lambda _i\), such that \(\epsilon _i|\lambda _i \sim \text {N}\left( 0,\frac{1}{\lambda _i}\right)\) and \(\lambda _i \sim \text {G}\left( \frac{v}{2},\frac{v}{2}\right)\), where \(\text {N}(\mu ,\sigma ^2)\) denotes a normal distribution with mean \(\mu\) and variance \(\sigma ^2\) and \(\text {G}(c_1,c_2)\) denotes the gamma distribution with mean \(c_1/c_2\) and variance \(c_1/c_2^2\). We define the interaction of two exposure variables \(X_{ij}\) and \(X_{ik}\) for the \(i^{th}\) individual as \(Z_{i_{jk}}=X_{ij} X_{ik}\) and \(\varvec{Z_i}=\left( Z_{i_{12}},Z_{i_{13}}, \ldots ,Z_{i_{(p-1)p}} \right) '\). Hence, \(\xi _i|\lambda _i \sim \text {N}\left( \varvec{U_i}'\varvec{\alpha }+\varvec{X_i' \beta }+\varvec{Z_i' \gamma },\frac{1}{\lambda _i}\right)\) and \(\lambda _i \sim \text {G}\left( \frac{v}{2},\frac{v}{2}\right)\), where \(\varvec{\beta }=\left( \beta _{1},\beta _{2},\ldots ,\beta _{p}\right) '\) and \(\varvec{\gamma }=\left( \gamma _{12},\gamma _{13},\ldots ,\gamma _{(p-1)p}\right) '\). The complete data likelihood is then

$$\begin{aligned} \pi (\varvec{Y},\varvec{\xi },\varvec{\lambda }|X)&= \prod _{i=1}^N \left[ Y_i 1 \left\{ \xi _i > 0 \right\} +(1-Y_i ) 1 \left\{ \xi _i \le 0 \right\} \right] \nonumber \\&\quad \times \left( 2\pi \right) ^{-\frac{1}{2}} \lambda _i^{\frac{1}{2}} \exp \left( -\frac{\lambda _i}{2} \left( \xi _i-\varvec{U_i' \alpha }-\varvec{X_i' \beta }- \varvec{Z_i' \gamma } \right) ^2 \right) \nonumber \\&\quad \times \frac{\left( \frac{v}{2}\right) ^{\frac{v}{2}}}{\Gamma \left( \frac{v}{2}\right) } \lambda _i ^{\frac{v}{2}-1} \exp \left( -\frac{\lambda _i v}{2} \right) . \end{aligned}$$
(4)
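
The scale-mixture representation used above is easy to verify numerically; the following is a minimal sketch of ours, not part of the original analysis:

```r
set.seed(1)
v <- 7
n <- 1e5

# draw epsilon via the scale mixture: lambda ~ G(v/2, v/2), eps | lambda ~ N(0, 1/lambda)
lambda <- rgamma(n, shape = v / 2, rate = v / 2)
eps <- rnorm(n, mean = 0, sd = 1 / sqrt(lambda))

# the marginal distribution of eps should match a t-distribution with v df
qqplot(qt(ppoints(n), df = v), eps)  # points fall on the identity line
abline(0, 1)
```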

The main and interaction effects can be estimated by choosing a vague prior such that \(\beta _j, \gamma _{jk} \sim \text {N}\left( 0,10^2\right)\); this is approximately a maximum likelihood approach. Incorporating global–local shrinkage parameters is attractive because the amount of shrinkage is then learned from the data. To that end, we can specify

$$\begin{aligned} \beta _j \sim \text {N}\left( 0, \frac{1}{a\eta _j}\right) , \quad \gamma _{jk} \sim \text {N}\left( 0, \frac{1}{b \theta _{jk}}\right) . \end{aligned}$$
(5)

The shrinkage priors in Eq. (5) do not impose the hierarchical principle [14, 15], under which interactions are considered only when the corresponding main effects are present. Recent work [1] incorporated this hierarchical condition through the following prior distribution:

$$\begin{aligned} \beta _j&\sim \text {N}\left( 0, \frac{1}{a\eta _j}\right) \, \, , \gamma _{jk} \sim \text {N}\left( 0, \frac{1}{b \eta _j \eta _k \theta _{jk}}\right) , \nonumber \\ \eta _j&\sim \text {G} (1,1) \, \, , \theta _{jk} \sim \text {G} (1,1). \end{aligned}$$
(6)

The prior distribution in Eq. (6) follows the global–local prior specification of [16]. In this formulation, a local shrinkage parameter controls the degree of shrinkage for each individual coefficient and a global shrinkage parameter controls the overall shrinkage. Here, for the main effect regression coefficient \(\beta _j\), we consider a predictor-specific local shrinkage parameter \(\eta _j\) that controls the degree of shrinkage for each exposure variable and a global parameter a that controls the overall shrinkage of the main effects towards the origin. Similarly, for the interaction effect regression coefficient \(\gamma _{jk}\), the local shrinkage parameter \(\theta _{jk}\) controls the degree of shrinkage for each interaction term, while the global shrinkage parameter b controls the overall shrinkage. We define \(\varvec{\eta }=\left( \eta _1,\eta _2,\ldots ,\eta _p\right) '\) as the p-dimensional vector of local shrinkage parameters for the main effects and \(\varvec{\theta }=\left( \theta _1,\theta _2,\ldots ,\theta _{p(p-1)/2}\right) '\) as the local shrinkage parameters for the interaction effects. As a prior for both the \(\eta _j\)’s and \(\theta _{jk}\)’s, we consider the \(\text {G} (1,1)\) distribution, with mean and variance 1, to avoid over-shrinkage and to incorporate variability. Larger values of the \(\eta _j\)’s and \(\theta _{jk}\)’s induce more shrinkage towards zero for the corresponding main effects and interaction effects, respectively, while smaller values induce less shrinkage. For the global shrinkage parameters a and b, we also consider the \(\text {G}(1,1)\) distribution as a prior to place substantial mass near the origin. Finally, we considered a vague prior \(\varvec{\alpha } \sim \text {MVN}(\varvec{0},10^2 \varvec{I_q})\), where \(\varvec{I_q}\) denotes the \(q \times q\) identity matrix.

The main objective of the shared shrinkage model is to link the main effects and the interaction effects. To that end, Kundu et al. [1] share information between the \(j^{th}\) and \(k^{th}\) main effects and the \((j,k)^{th}\) interaction effect through the local parameters \(\eta _j\) and \(\eta _k\). The prior variance of \(\gamma _{jk}\) is controlled by the product \(\eta _j \eta _k \theta _{jk}\), so \(\gamma _{jk}\) shrinks to zero if at least one of the corresponding main effects \(\beta _j\) or \(\beta _k\) is small (i.e., its local shrinkage parameter \(\eta _j\) or \(\eta _k\) is large) or if the interaction's own local shrinkage parameter \(\theta _{jk}\) is large. Conversely, if the main effects are sizeable, i.e., the corresponding \(\eta _j\) and \(\eta _k\) are small, the prior induces less shrinkage on the interaction term \(\gamma _{jk}\).
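
A quick prior simulation makes the shared-shrinkage behavior concrete; this sketch is ours and simply draws from the G(1,1) priors in Eq. (6):

```r
set.seed(2)
n.draws <- 1e5

# global and local shrinkage parameters, all G(1,1) as in Eq. (6)
b        <- rgamma(n.draws, 1, 1)
eta.j    <- rgamma(n.draws, 1, 1)
eta.k    <- rgamma(n.draws, 1, 1)
theta.jk <- rgamma(n.draws, 1, 1)

# interaction coefficient under the shared shrinkage prior
gamma.jk <- rnorm(n.draws, 0, sqrt(1 / (b * eta.j * eta.k * theta.jk)))

# when eta.j is large (i.e., beta_j is heavily shrunk), gamma_jk is tighter too:
# median |gamma_jk| is smaller in the eta.j-above-median group
tapply(abs(gamma.jk), eta.j > median(eta.j), median)
```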

Figure 4 shows a comparison of the estimated interactions for the NHL study. For all interaction terms under the three models, the 95% HPD interval for \(\gamma _{jk}\) contains zero, suggesting that there is no evidence for any two-way interaction among the components of the mixture. Ordering the interval lengths from largest to smallest gives the vague prior, the independent shrinkage prior, and the shared shrinkage prior, demonstrating the efficiency advantages of incorporating a shrinkage prior and exploiting the hierarchical assumption in parameter estimation.

Fig. 4 Comparison of interaction effects with Diazinon under the different models

As an additional sensitivity analysis, we also examined other shrinkage priors, including ridge [17], Lasso [18], and horseshoe [19] priors. Under a linear exposure model (each exposure contributes a single linear term), estimation with these shrinkage priors can be done directly with the R package bayesreg; a sketch is given below. We emphasize, however, that these methods do not incorporate any hierarchical structure. Figure 5 shows the comparison of interaction estimation using the different shrinkage priors. As we have 91 interaction terms and are comparing six different methods, we illustrate the results with the estimation of two interaction terms. Although the intervals from all six methods contain zero, the shared shrinkage prior approach has the narrowest interval and is therefore the most efficient.
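
Such fits might look as follows, under our naming assumptions for y and X from Sect. 2 (sampler settings are illustrative; per our reading of the package documentation, bayesreg's logistic model expects a two-level factor response):

```r
library(bayesreg)

dat <- data.frame(y = factor(y), X)  # two-level factor response (assumption)
f <- y ~ .^2  # expands to all main effects plus all two-way interactions

# logistic regressions under ridge, lasso, and horseshoe shrinkage priors
fit.ridge <- bayesreg(f, data = dat, model = "logistic", prior = "ridge",
                      n.samples = 5000, burnin = 2000)
fit.lasso <- bayesreg(f, data = dat, model = "logistic", prior = "lasso",
                      n.samples = 5000, burnin = 2000)
fit.hs    <- bayesreg(f, data = dat, model = "logistic", prior = "horseshoe",
                      n.samples = 5000, burnin = 2000)

summary(fit.hs)  # posterior summaries, including credible intervals
```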

Fig. 5 Comparison of interaction effects with Diazinon under different prior choices

So far, we have described modeling a linear exposure–outcome relationship. In practice, the exposure effect may be non-linear, requiring more than one regression parameter per exposure. Kundu et al. [1] extended their methodology to capture such non-linear exposure–outcome relationships using the following logistic regression model:

$$\begin{aligned} \text {logit}\, P\left( Y_i=1|X_{ij}, \varvec{U_i}\right) =\varvec{U_i'\alpha }+\sum _{j=1}^p g_j\left( X_{ij}\right) +\sum _{j=1}^{p-1} \sum _{k=j+1}^{p} f_{jk} \left( X_{ij} ,X_{ik}\right) . \end{aligned}$$
(7)

We use a polynomial representation to model the non-linear exposure effect of each chemical. These polynomial effects are incorporated in the main and interaction effects by setting \(g_j\left( X_{ij}\right) =\varvec{X_{ij}'\beta _{j}}\) and \(f_{jk} \left( X_{ij},X_{ik}\right) =\varvec{Z_{jk}' \gamma _{jk}}\) in Eq. (7), giving the following logistic regression:

$$\begin{aligned} \text {logit}\, P\left( Y_i=1|\varvec{X_i,Z_i, U_i}\right) =\varvec{U_i'\alpha }+\sum _{j=1}^p\varvec{ X_{ij}'\beta _{j}} +\sum _{j=1}^{p-1} \sum _{k=j+1}^{p} \varvec{Z_{jk}' \gamma _{jk}} , \quad i=1,2,\ldots ,N, \end{aligned}$$
(8)

where \(\varvec{X_{ij}}=\left( X_{ij}, X_{ij}^2\right) '\), \(\varvec{Z_{jk}}=\left( X_{ij}X_{ik}, X_{ij}^2X_{ik}, X_{ij}X_{ik}^2, X_{ij}^2X_{ik}^2\right) '\), and the regression coefficients are \(\varvec{\beta _{j}}=\left( \beta _{j1},\beta _{j2}\right) '\) and \(\varvec{\gamma _{jk}}= \left( \gamma _{jk1},\gamma _{jk2},\gamma _{jk3},\gamma _{jk4}\right) '\).
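
Constructing these quadratic bases is mechanical; a small helper of ours makes the layout of Eq. (8) explicit:

```r
# quadratic main-effect and interaction bases for a pair of exposures (Eq. 8)
quad.bases <- function(xj, xk) {
  list(
    Xj  = cbind(xj, xj^2),                                   # main-effect terms
    Zjk = cbind(xj * xk, xj^2 * xk, xj * xk^2, xj^2 * xk^2)  # interaction terms
  )
}
```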

Most interesting for our application is to incorporate exposures subject to an LOD in a robust manner. We use a two-parameter per exposure model that was recently discussed for univariate exposure relationships by Chiou et al. [20] and Ortega-Villa et al. [21]. In this formulation, (i) the first component indicates whether the exposure for a single chemical is above the detection limit, and (ii) the second component captures the effect of the exposure value when it is above the detection limit. This parameterization allows a flexible modeling approach rather than treating values below the LOD as left censored. Kundu et al. [1] extend their model by specifying the functions in Eq. (7) as follows:

$$\begin{aligned} g_j\left( X_{ij}\right)&=\beta _{j1}I\left( X_{ij} \ge C_j \right) +\beta _{j2}I\left( X_{ij} \ge C_j \right) \left( X_{ij} -C_{j}\right) \nonumber \\ f_{jk} \left( X_{ij} ,X_{ik}\right)&=\gamma _{jk1}I\left( X_{ij} \ge C_j \right) I\left( X_{ik} \ge C_k \right) +\gamma _{jk2}\left( X_{ij} -C_{j}\right) I\left( X_{ij} \ge C_j \right) I\left( X_{ik} \ge C_k \right) \nonumber \\&\quad + \gamma _{jk3}\left( X_{ik} -C_{k}\right) I\left( X_{ij} \ge C_j \right) I\left( X_{ik} \ge C_k \right) \nonumber \\&\quad +\gamma _{jk4}\left( X_{ij} -C_{j}\right) \left( X_{ik} -C_{k}\right) I\left( X_{ij} \ge C_j \right) I\left( X_{ik} \ge C_k \right) , \end{aligned}$$
(9)

where \(\beta _{j1}\) is the log odds of disease at the detection limit relative to the log odds of disease below the detection limit, and \(\beta _{j2}\) is the log odds ratio of disease for a one-unit change in exposure above the detection limit. The interactive effects are measured by the parameter vector \(\varvec{\gamma _{jk}}\). Here, \(\gamma _{jk1}\) represents the interactive effect when both the \(j^{th}\) and \(k^{th}\) chemicals are above their detection limits, \(\gamma _{jk4}\) represents the interactive effect of increasing \(X_{ij}\) and/or \(X_{ik}\) when both chemicals are above their detection limits, and \(\gamma _{jk2}\) and \(\gamma _{jk3}\) are cross-product interaction effects.
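
In code, the per-chemical LOD basis and its pairwise interaction terms could be built as below (a sketch of ours; Cj and Ck denote the known detection limits):

```r
# two-parameter LOD coding for one exposure: detect indicator plus excess above C (Eq. 9)
lod.basis <- function(x, C) {
  above <- as.numeric(x >= C)
  cbind(ind = above, slope = above * (x - C))
}

# interaction terms for chemicals j and k: cross-products of the two bases
lod.interaction <- function(xj, Cj, xk, Ck) {
  Bj <- lod.basis(xj, Cj)
  Bk <- lod.basis(xk, Ck)
  cbind(Bj[, "ind"]   * Bk[, "ind"],    # gamma_jk1: both above LOD
        Bj[, "slope"] * Bk[, "ind"],    # gamma_jk2: excess in j, k above LOD
        Bj[, "ind"]   * Bk[, "slope"],  # gamma_jk3: excess in k, j above LOD
        Bj[, "slope"] * Bk[, "slope"])  # gamma_jk4: slope-by-slope term
}
```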

Using the two-parameter per exposure model, Fig. 6 shows that we found multiple interaction effects, some demonstrating positive synergy between chemicals and others showing negative interaction. The fact that these results differ from those of the imputation approach suggests that imputing based on a parametric normal model may be problematic.

Fig. 6 Comparison of randomly chosen slope vs slope (\(\gamma _{jk4}\)) interaction effects

5 A Latent Functional Approach

To incorporate non-linear exposure–risk relationships in a binary regression setting, Kim et al. [5] proposed a latent functional approach in which the individual effects of each exposure in a risk model are written as a sum of unobserved functions. They showed that the relationship between chemical exposures and risk becomes more flexible as the number of latent functions increases and that complex exposure relationships can be represented with only a few such functions. In this article, we extend the methodology to allow separate sets of latent classes for the main effects and the interaction effects.

As in Sect. 4, let \(Y_{i}\) be a binary outcome for the \(i^{th}\) individual. Also, let \(\varvec{U}_i = \left( U_{i1}, \dots , U_{iq} \right) '\) denote a q-dimensional vector of covariates for the \(i^{th}\) individual and \(\varvec{\alpha }= \left( \alpha _1, \dots , \alpha _q \right) '\) denote the vector of regression coefficients corresponding to the q covariates. Furthermore, let \(X_{ij}\) denote the exposure to the \(j^{th}\) chemical for the \(i^{th}\) individual (the main effects), \(j = 1, \dots , p\), and let \(Z_{ik} = X_{ij_1} X_{ij_2}\) denote the two-way interactions, \(j_1 = 1, \dots , p-1\), \(j_2 = j_1 + 1, \dots , p\), \(k = 1, \dots , K\) with \(K = p(p-1)/2\), and \(i = 1, \dots , N\). Similar to Kim et al. [5], we use a binary regression model with interactions based on a finite number of non-linear functions, using the latent variable approach of Albert and Chib [11]:

$$\begin{aligned}&Y_i = {\left\{ \begin{array}{ll} 1 &{} \text{ if } \xi _{i} \ge 0 \\ 0 &{} \text{ if } \xi _{i} < 0 \end{array}\right. } \;\; \text{ and } \nonumber \\&\xi _{i} = \varvec{U}_i'\varvec{\alpha }+ \sum _{j=1}^p \sum _{l=1}^{L} 1(g_j = l) f_l(X_{ij}) + \sum _{k=1}^K \sum _{m=1}^{M} 1(h_k = m) s_m(Z_{ik}) + \epsilon _i, \end{aligned}$$
(10)

where \(f_l(X_{ij})\) is a functional form of \(X_{ij}\) for the \(l^{th}\) latent class, \(g_j\) is a latent membership indicator with \(P(g_j = l) = \omega _l\), L is a fixed number of latent classes \((1 \le L \le p)\), \(s_m(Z_{ik})\) is a functional form of \(Z_{ik}\) for the \(m^{th}\) latent class, \(h_k\) is a latent membership indicator with \(P(h_k = m) = \psi _m\), M is a fixed number of latent classes \((1 \le M \le K)\), and \(\epsilon _i\) follows a t-distribution with \(\nu = 7\) degrees of freedom. The indicator function \(1\{A\}\) is defined as \(1\{A\} = 1\) if A is true and 0 otherwise. In this paper, we assume a polynomial regression function of order c to capture the non-linear structure of \(f_l(X_{ij})\) and \(s_m(Z_{ik})\) in Eq. (10): \(f_l(X_{ij}) = \beta _{l1}X_{ij} + \dots + \beta _{lc} X_{ij}^c \equiv {\varvec{X}_{ij}^*}' \varvec{\beta }_l\) and \(s_m(Z_{ik}) = \gamma _{m1}Z_{ik} + \dots + \gamma _{mc} Z_{ik}^c \equiv {\varvec{Z}_{ik}^*}' \varvec{\gamma }_m\), where \(\varvec{X}_{ij}^* = (X_{ij}, \dots , X_{ij}^c)'\), \(\varvec{\beta }_l = (\beta _{l1}, \dots , \beta _{lc})'\), \(\varvec{Z}_{ik}^* = (Z_{ik}, \dots , Z_{ik}^c)'\), and \(\varvec{\gamma }_m = (\gamma _{m1}, \dots , \gamma _{mc})'\). The latent variable \(\xi _{i}\) in Eq. (10) can then be rewritten as

$$\begin{aligned} \xi _{i}&= \varvec{U}_i'\varvec{\alpha }+ \sum _{j=1}^p \sum _{l=1}^{L} 1(g_j = l) {\varvec{X}_{ij}^*}' \varvec{\beta }_l + \sum _{k=1}^K \sum _{m=1}^{M} 1(h_k = m) {\varvec{Z}_{ik}^*}' \varvec{\gamma }_m + \epsilon _i \nonumber \\&= \varvec{U}_i'\varvec{\alpha }+ \sum _{j=1}^p {\varvec{X}_{ij}^*}' \varvec{\delta }_j^x + \sum _{k=1}^K {\varvec{Z}_{ik}^*}' \varvec{\delta }_k^z + \epsilon _i, \end{aligned}$$
(11)

where \(\varvec{\delta }_j^x = \sum _{l=1}^{L} 1(g_j = l) \varvec{\beta }_l\) and \(\varvec{\delta }_k^z = \sum _{m=1}^{M} 1(h_k = m) \varvec{\gamma }_m\) are the regression coefficients for the \(j^{th}\) main effect and the \(k^{th}\) interaction term, respectively. Prior distributions and an MCMC algorithm similar to those in Kim et al. [5] are used in the analysis. To improve the numerical stability of the MCMC sampling algorithm, we standardized all covariates by subtracting their sample means and dividing by their sample SDs; all main effect and interaction variables were standardized by dividing by their maximum values.
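
To make Eq. (11) concrete, the following sketch (ours; all object names hypothetical) evaluates the linear predictor given class memberships and class-level coefficients:

```r
# polynomial basis (x, x^2, ..., x^c) for one variable
poly.basis <- function(x, c.deg) outer(x, seq_len(c.deg), `^`)

# linear predictor of Eq. (11): U alpha + sum_j X*_ij' beta_{g_j} + sum_k Z*_ik' gamma_{h_k}
# U: N x q covariates; X: N x p exposures; Z: N x K pairwise products
# g, h: latent class labels; Beta (L x c) and Gamma (M x c): class coefficients
xi.mean <- function(U, alpha, X, g, Beta, Z, h, Gamma, c.deg = 3) {
  out <- drop(U %*% alpha)
  for (j in seq_len(ncol(X)))
    out <- out + drop(poly.basis(X[, j], c.deg) %*% Beta[g[j], ])
  for (k in seq_len(ncol(Z)))
    out <- out + drop(poly.basis(Z[, k], c.deg) %*% Gamma[h[k], ])
  out
}
```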

We assumed a cubic polynomial regression function (\(c = 3\)) for \(f_l(X_{ij})\) and \(s_m(Z_{ik})\) in Eq. (10) to allow a flexible functional form. We considered models with various values of L and M to choose the number of latent classes needed to characterize the simultaneous effects of all chemicals on cancer risk. Table 1 shows the estimated posterior probabilities of \(\omega _l\) and \(\psi _m\) for the model with \(L = 5\) and \(M = 5\); the posterior probabilities for \(l > 3\) and \(m > 3\) were almost zero, suggesting that many latent profiles are not needed.

Table 1 Posterior probability of \(\omega _l\) and \(\psi _m\) for the model with \(L = 5\) and \(M = 5\)

Figure 7 shows the estimated log relative risks for the individual functional relationships, with the corresponding 95% HPD intervals, for the 14 main effects and 91 interaction terms under the model with \(L=5\) and \(M=5\). All 95% HPD intervals include zero, indicating that none of the main effects or interaction terms is associated with NHL.

Fig. 7 Plots of the estimated log relative risks (relative to no exposure) as a function of chemical exposure, with the posterior mean and 95% HPD intervals, under the model with \(L=5\) and \(M=5\): a main effects; b interaction effects

6 Discussion

Recently, numerous statistical approaches have been proposed in the statistics literature for studying the interactions among chemical mixture components. These approaches perform well under a correct model specification. However, there have been few comparisons of these methodologies on actual study data. This paper compares several recently developed approaches using a case–control study of NHL that examined the effects of multiple pollutants on cancer risk.

A challenging analytic issue in the analysis was the high proportion of measurements below the LOD across chemicals. The original analyses of the study [9] used a simple imputation method for values below the LOD. Using these imputations, all the methods produced similar interaction effect estimates that were consistent with zero. Although we used only one realization from the imputation model for all analyses, we obtained nearly identical estimates using other realizations (data not shown).

Recognizing that the imputation approaches make strong assumptions about the distributions below the LOD, we conducted an additional analysis in which each chemical exposure was represented by two parameters: one for being above the LOD and a second for the slope above this limit. Our two-parameter per exposure model does not require those strong assumptions. These analyses focused on the shrinkage estimation since this class of models can more easily be extended in a flexible way. Many interactions were identified with this formulation. In part, this can be explained if the imputation methods, which are difficult to validate, are inadequate (see Ortega-Villa et al. [21] for a simulation study with one exposure measurement). These results motivate future methodological work extending approaches such as BKMR and the latent functional approach to incorporate the LOD more flexibly.

The different methods make different assumptions about the linearity of the exposure effects. BKMR introduces flexible relationships through the specification of the kernel function; however, it is not fully transparent what explicit linearity assumptions are implied by a particular kernel choice. The latent functional approach explicitly assumes a polynomial form for the exposure relationships.

Each of the proposed methods used the scaled absolute exposure values in the analyses. We also applied all the methods to percentiles of the exposure values rather than the absolute measurements. We were able to fit all methods except BKMR to these transformed exposure values and could not determine the reason for BKMR's computational failure in this situation. For all other methods, we obtained inferences similar to those obtained with the absolute values (data not shown).

The methodology comparison focused on analyses from a case–control study. All the methods except BKMR have a direct relative-risk interpretation since we incorporated a logit link function and NHL is a rare disease. The interpretation for BKMR is less clear since this approach uses a probit link function to relate the mixture components to cancer risk. The methodology and comparisons naturally apply to cohort studies with binary outcomes. Extensions to survival and longitudinal outcomes are areas for future research.