Covariate-adjusted multiple testing in genome-wide association studies via factorial hidden Markov models

Abstract

Accounting for the dependence structure among multiple tests is increasingly important, especially in genome-wide association studies (GWAS). Existing procedures, such as the local index of significance (LIS) and the pooled local index of significance (PLIS), were proposed to test hidden Markov model (HMM)-dependent hypotheses under the framework of compound decision theory and have been successfully applied to GWAS. However, the etiology of complex diseases depends not only on genetic effects but also on environmental factors. Failure to account for covariates in multiple testing can bias the estimated association of interest or reduce testing efficiency. In this paper, we develop a covariate-adjusted multiple testing procedure, called the covariate-adjusted local index of significance (CALIS), which accounts for the effects of environmental factors via a factorial hidden Markov model. The theoretical results show that our procedure controls the false discovery rate (FDR) at the nominal level and has the smallest false non-discovery rate (FNR) among all valid FDR procedures. We further demonstrate the advantage of our procedure over the existing procedures through simulation studies and a real data analysis.

Introduction

Large-scale multiple testing problems arise in many scientific applications. For instance, in genome-wide association studies (GWAS), one needs to perform tens of thousands of tests to identify the single nucleotide polymorphisms (SNPs) associated with a complex disease. Other examples include neuroimaging data analysis (Shu et al. 2015), microarray data analysis (Liang and Nettleton 2010; Liang et al. 2018), and spatial data analysis (Sun et al. 2015). To date, a number of multiple testing procedures have been proposed for various scientific fields. However, several issues in large-scale multiple testing remain difficult. First, the growing availability of high-throughput data requires us to conduct tens of thousands of tests simultaneously. Multiple testing procedures based on traditional control criteria, such as the family-wise error rate (FWER), are overly conservative and have low power when testing a large number of hypotheses. Second, the hypotheses in multiple testing often exhibit complex dependence in practice. For example, in gene category testing problems, one needs to test hundreds of null hypotheses that correspond to nodes in a gene ontology graph (Liang and Nettleton 2010; Liang et al. 2018). Ignoring the dependence structure among hypotheses may lead to a loss of testing efficiency. Finally, the statistics yielded by multiple tests may be affected by external covariates. In genetic association analyses, we usually need to evaluate the association between genetic factors and disease variables of interest while adjusting for covariate effects. Failure to account for covariate effects may bias the estimated association or reduce testing efficiency (Zhu et al. 2012).

To address the aforementioned issues, a collection of large-scale multiple testing approaches has been proposed. In a seminal paper, Benjamini and Hochberg (1995) proposed a control criterion referred to as the false discovery rate (FDR) for multiple testing and reported that multiple testing procedures based on the FDR tend to achieve higher testing efficiency. The FDR is defined as the expected proportion of false rejections among all rejections. Correspondingly, the false non-discovery rate (FNR; Genovese and Wasserman 2002), an alternative measure of Type II error, is defined as the expected proportion of falsely accepted hypotheses among all accepted hypotheses. In general, we wish to develop a multiple testing procedure that controls the FDR at a pre-specified level \(\alpha \) and has the smallest FNR among all FDR procedures at level \(\alpha \). The traditional p-value-based FDR procedures (Benjamini and Hochberg 1995, 2000; Storey 2002; Genovese and Wasserman 2002) essentially seek a cut-off along the ranked p-values. Typically, these procedures assume that the tests are independent. In reality, however, this assumption rarely holds because of complex dependence structures among tests. It is worthwhile but challenging to develop a multiple testing procedure that properly exploits the dependence among hypotheses. Sun and Cai (2009) suggested using the hidden Markov model (HMM) to characterize the local dependence among tests and proposed a multiple testing procedure, termed the LIS procedure hereafter. Because disease-associated SNPs tend to be clustered and dependent, the HMM, which can model the correlation among adjacent SNPs, has been successfully applied in GWAS. Wei et al. (2009) assumed that the whole chromosome follows an HMM and extended the LIS procedure to allow for ranking LIS values across all chromosomes. Xiao et al. (2013) pointed out that different chromosome regions should follow different HMMs and suggested using a modified LIS procedure based on region-specific HMMs in GWAS. Other extensions of the LIS procedure can be found in Kuan and Chiang (2012), Shu et al. (2015) and Liu et al. (2016).

It is important to note that none of these existing procedures takes covariate effects into account in multiple testing. It has been reported that the etiology of complex diseases (such as asthma, atherosclerosis, bipolar disorder and alcoholism) depends not only on genetic effects but also on covariates, such as environmental factors (Wang et al. 2006; Jiang and Zhang 2011; Zhu et al. 2012). Appropriate covariate adjustment in GWAS is crucial for identifying the genetic effects of interest. So far, only a handful of methods have been proposed for multiple testing with covariate effects. Zablocki et al. (2014) suggested leveraging locus-specific covariates, such as functional annotations, to improve gene discovery in GWAS. Zablocki et al. (2017) further proposed a semi-parametric procedure for covariate-modulated multiple testing. More recently, Lei and Fithian (2018) proposed a multiple testing procedure that adaptively selects the p-value rejection threshold by using generic side information as covariates. However, little work has been done to address both the covariate and dependence issues. It is therefore necessary to explore new multiple testing procedures that allow for covariate adjustment while exploiting the dependence among tests.

To adjust for the effects of covariates, we propose a factorial HMM for large-scale multiple testing. The factorial HMM, named by Ghahramani and Jordan (1997), is a generalization of the HMM in which the hidden state is factored into multiple state variables. Specifically, we factor each hidden state of the HMM into two state variables: one indicates the state (null or non-null) of the hypothesis of primary interest, and the other denotes the existence status (presence or absence) of covariate effects. Following the fundamental work of Sun and Cai (2009), we further assume that the existence status of covariate effects follows a Markov chain. This assumption is natural and reasonable in many applications. For example, in GWAS, SNP-level functional annotations may affect the distribution of the test statistics of interest (Schork et al. 2013). Since adjacent genomic loci tend to co-segregate in meiosis, it is reasonable to assume that SNPs with similar functional annotations are clustered and locally dependent.

Essentially, most multiple testing procedures involve two steps: ranking the hypotheses based on a suitable multiple testing statistic, such as p-values (Benjamini and Hochberg 1995), Lfdr values (local false discovery rate; Efron et al. 2001) or LIS values (local index of significance; Sun and Cai 2009), and then choosing a cut-off along the rankings. Consequently, there are two fundamental problems to solve, namely, deriving an optimal multiple testing statistic based on the factorial HMM and then setting a suitable threshold. We address these problems in three steps. First, we demonstrate that the optimal statistics in the weighted classification problems are equivalent to those in the multiple testing problems under the generalized monotone ratio condition (GMRC). Second, we derive the optimal statistics in the weighted classification problems and define the corresponding multiple testing statistic, termed the covariate-adjusted local index of significance (CALIS). Finally, with derivations similar to those in Sun and Cai (2009), Genovese and Wasserman (2004) and Newton et al. (2004), we obtain a suitable cut-off along the ranked CALIS values. Based on the factorial HMM, the resulting multiple testing procedure (called the CALIS procedure hereafter) not only leverages the dependence among tests but also accommodates covariate effects. Furthermore, both simulations and a real data analysis illustrate that the CALIS procedure is valid and achieves higher testing efficiency by adjusting for covariate effects.

The rest of this paper is organized as follows. In Sect. 2, we first give a brief review of the LIS procedure under an HMM (Sun and Cai 2009). Building on their fundamental work, we introduce the framework of covariate-adjusted multiple testing under the factorial HMM and illustrate the connection between the weighted classification problems and the covariate-adjusted multiple testing problems. Then, we propose optimal covariate-adjusted multiple testing procedures for the cases where the factorial HMM parameters are known and unknown, respectively (i.e., the oracle CALIS procedure and the data-driven CALIS procedure). The theoretical results and detailed implementations of these procedures are discussed at the end of the section. In Sect. 3, we carry out extensive simulations to evaluate the performance of our oracle and data-driven CALIS procedures. A real data analysis of bipolar disorder is presented in Sect. 4. Finally, we conclude with some discussion and suggestions for future work in Sect. 5.

Statistical methods

The LIS procedure under an HMM

Suppose that there are m hypotheses of interest \(\{H_i\}^m_{i=1}\) to test simultaneously. Let \(\{\theta _i\}^m_{i=1}\) be the underlying states of the m hypotheses, where \(\theta _i=1\) implies that hypothesis \(H_i\) belongs to the non-null case and \(\theta _i=0\) otherwise. Sun and Cai (2009) suggested using an HMM to model the dependence structure among hypotheses and assumed that \(\{\theta _i\}^m_{i=1}\) follows a stationary, irreducible and aperiodic Markov chain. They further assumed that the observations \(\{z_i\}^m_{i=1}\) are conditionally independent given the hidden states \(\{\theta _i\}^m_{i=1}\), i.e.,

$$\begin{aligned} P(\{z_i\}^m_{i=1}|\{\theta _i\}^m_{i=1})=\prod ^m_{i=1}P(z_i|\theta _i), \end{aligned}$$

where the observed value \(z_i\) can be a z-value (Wei et al. 2009) or a test statistic (Liu et al. 2016) corresponding to the i-th hypothesis. The LIS is defined as

$$\begin{aligned} LIS_j=P_{\varphi }(\theta _j=0|\{z_i\}^m_{i=1}), \end{aligned}$$

where \(\varphi \) represents the parameters of the HMM. Denote by \(LIS_{(1)},LIS_{(2)},\dots ,LIS_{(m)}\) the ordered LIS values and \(H_{(1)},H_{(2)},\dots ,H_{(m)}\) the corresponding null hypotheses. Then the LIS procedure can be described as follows:

$$\begin{aligned} \mathrm{Let}~ l=\max \left\{ i:\frac{1}{i}\sum ^i_{j=1}LIS_{(j)}\le \alpha \right\} ;~ \mathrm{then~ reject~ all~} H_{(i)},~ \mathrm{for~} i=1,\dots ,l. \end{aligned}$$
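The step-up rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lis_step_up` and its list-based interface are assumptions made here, not part of the original procedure's software, and the LIS values are taken as already computed.

```python
def lis_step_up(lis, alpha=0.1):
    """Return a list of booleans: True where the hypothesis is rejected.

    `lis` holds the LIS values in the original hypothesis order (hypothetical
    interface); `alpha` is the nominal FDR level.
    """
    m = len(lis)
    # Indices that sort the values so that LIS_(1) <= ... <= LIS_(m).
    order = sorted(range(m), key=lambda i: lis[i])
    running_sum = 0.0
    l = 0
    for rank, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        if running_sum / rank <= alpha:
            l = rank  # keep the largest rank whose running mean is <= alpha
    reject = [False] * m
    for idx in order[:l]:
        reject[idx] = True
    return reject


# Usage: the three smallest values have running mean 0.26 / 3 <= 0.1.
decisions = lis_step_up([0.01, 0.5, 0.05, 0.9, 0.2], alpha=0.1)
```

Because the running mean of the sorted values is non-decreasing, the loop could stop at the first failure; scanning all ranks simply mirrors the max in the displayed rule.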

It is worth noting that the LIS procedure implicitly assumed that the observations merely depend on the states of main effect hypotheses, namely, \(\{\theta _i\}^m_{i=1}\). In practice, however, the observations are usually affected by external covariates. To account for the covariate effects, we introduce a factorial HMM in the following section.

The covariate-adjusted multiple testing via the factorial HMM

Let \(\{\gamma _i\}^m_{i=1}\) be the existence status of covariate effects, where \(\gamma _i=1\) indicates \(z_i\) is affected by external covariate effects and \(\gamma _i=0\) otherwise. Assume that \(\{\theta _i\}^m_{i=1}\) and \(\{\gamma _i\}^m_{i=1}\) are mutually independent, and each follows a stationary, irreducible and aperiodic Markov chain, respectively. The corresponding transition probabilities are

$$\begin{aligned} a_{ij}=P(\theta _s=j|\theta _{s-1}=i), \end{aligned}$$
(1)

and

$$\begin{aligned} b_{ij}=P(\gamma _s=j|\gamma _{s-1}=i). \end{aligned}$$
(2)

Moreover, assume that the observations \(\{z_i\}^m_{i=1}\) are conditionally independent given the hidden states \(\{\theta _i\}^m_{i=1}\) and \(\{\gamma _i\}^m_{i=1}\), namely,

$$\begin{aligned} P(\{z_i\}^m_{i=1}|\{\theta _i\}^m_{i=1},\{\gamma _i\}^m_{i=1})=\prod ^m_{i=1}P(z_i|\theta _i,\gamma _i). \end{aligned}$$
(3)

Following the two-component mixture model (Sun and Cai 2009; Wei et al. 2009; Xiao et al. 2013), we further assume the random variable \(Z_i\) (with respect to \(z_i\)) follows a four-component mixture model:

$$\begin{aligned} Z_i|\theta _i,\gamma _i\sim (1-\theta _i)(1-\gamma _i)F_{0,0}+(1-\theta _i)\gamma _iF_{0,1}+\theta _i(1-\gamma _i)F_{1,0}+\theta _i\gamma _iF_{1,1}, \end{aligned}$$
(4)

where \(F_{0,0}\), \(F_{0,1}\), \(F_{1,0}\), and \(F_{1,1}\) are the conditional distributions of \(Z_i\) given \((\theta _i, \gamma _i)=(0,0), (0,1), (1,0)\), and (1, 1), respectively. The dependence model (1)–(3) is called a factorial hidden Markov model and was also discussed by Ghahramani and Jordan (1997). The structure of the factorial HMM is illustrated by the directed graph in Fig. 1.

Fig. 1 Graphical representation of the factorial hidden Markov model

In practice, it is natural to assume that \(F_{0,0}\) is the standard normal distribution N(0, 1), and \(F_{0,1}\), \(F_{1,0}\) and \(F_{1,1}\) are normal mixtures. Since our main idea is to adjust for the covariate effects in multiple testing, for simplicity, we only assume that \(F_{1,1}\) is a normal mixture and that \(F_{0,1}\) and \(F_{1,0}\) are normal distributions. Extending to the setting where all alternatives are normal mixtures is straightforward but requires additional computation. Note that the number of components in the normal mixture, denoted by L, is usually unknown. As Sun and Cai (2009) suggested, we can use likelihood-based criteria, such as the Akaike or Bayesian information criterion (AIC or BIC), to select an appropriate L. Denote by \({\mathcal {A}}=(a_{ij})_{2\times 2}\) and \({\mathcal {B}}=(b_{ij})_{2\times 2}\) the transition probability matrices of \(\{\theta _i\}^m_{i=1}\) and \(\{\gamma _i\}^m_{i=1}\), where \(a_{ij}=P(\theta _s=j|\theta _{s-1}=i)\) and \(b_{ij}=P(\gamma _s=j|\gamma _{s-1}=i)\) for \(i=0, 1\), \(j=0, 1\). Let \(\pi =(\pi _0, \pi _1)\) and \({\tilde{\pi }}=({\tilde{\pi }}_0,{\tilde{\pi }}_1)\), where \(\pi _j=P(\theta _s=j)\) and \({\tilde{\pi }}_j=P(\gamma _s=j)\) are the stationary distributions of \(\{\theta _i\}^m_{i=1}\) and \(\{\gamma _i\}^m_{i=1}\), respectively. For convenience, let \({\mathcal {F}}=\{F_{0,0}, F_{0,1}, F_{1,0}, F_{1,1}\}\), and denote by \(\vartheta =({\mathcal {A}}, {\mathcal {B}}, \pi , {\tilde{\pi }}, {\mathcal {F}})\) the parameters of the covariate-adjusted multiple testing problem under a factorial HMM.
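As a small worked step (a standard fact about two-state Markov chains rather than a result specific to this paper), the stationary distributions are tied to the transition matrices. Solving \(\pi {\mathcal {A}}=\pi \) with \(\pi _0+\pi _1=1\), and likewise for \({\tilde{\pi }}\) and \({\mathcal {B}}\), gives

$$\begin{aligned} \pi _0=\frac{a_{10}}{a_{01}+a_{10}},\quad \pi _1=\frac{a_{01}}{a_{01}+a_{10}},\quad {\tilde{\pi }}_0=\frac{b_{10}}{b_{01}+b_{10}},\quad {\tilde{\pi }}_1=\frac{b_{01}}{b_{01}+b_{10}}. \end{aligned}$$

For illustration, \(a_{01}=0.05\) and \(a_{10}=0.1\) would give \(\pi =(2/3, 1/3)\).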

The relationship between covariate-adjusted multiple testing and weighted classification

Sun and Cai (2009) developed a compound decision theoretic framework for both the weighted classification problem and the multiple testing problem in an HMM. They showed that, under the monotone ratio condition (MRC), the optimal statistic in the weighted classification problem is equivalent to that in the multiple testing problem. Inspired by their fundamental work, we extend the MRC to the generalized monotone ratio condition (GMRC) for our covariate-adjusted multiple testing under a factorial HMM. It can be shown that the optimal statistics in the weighted classification problem and the covariate-adjusted multiple testing problem are still equivalent under the GMRC.

For simplicity, let \({\mathbf {z}}=\{z_i\}^m_{i=1}\). Let \(\lambda \) be the relative cost of a false positive to a false negative. Consider the weighted classification problem with the loss function

$$\begin{aligned} L_\lambda (\theta , \delta )=\frac{1}{m}\sum ^m_{i=1}\{\lambda (1-\theta _i)\delta _i+\theta _i(1-\delta _i)\}, \end{aligned}$$
(5)

where \(\delta _i=1_{\{T_i({\mathbf {z}})<t\}},i=1,\dots ,m,\) is a classification rule and \(T_i({\mathbf {z}})\) is a classification statistic. Let \(G^{jk}_i(t)=P(T_i({\mathbf {z}})<t|\theta _i=j,\gamma _i=k)\) be the conditional cumulative distribution functions (CDFs) of \(T_i({\mathbf {z}})\) for \(j=0,1,k=0,1\). Let \(G^{jk}(t)=\frac{1}{m}\sum \limits ^m_{i=1} G^{jk}_i(t)\) be the average conditional CDFs of \(T_i({\mathbf {z}})\) and \(g^{jk}(t)=(d/dt)G^{jk}(t)\) be the average conditional probability density functions (PDFs) of \(T_i({\mathbf {z}})\). Define the generalized monotone ratio condition:

$$\begin{aligned} \frac{{\tilde{\pi }}_0g^{10}(t)+{\tilde{\pi }}_1g^{11}(t)}{{\tilde{\pi }}_0g^{00}(t)+{\tilde{\pi }}_1g^{01}(t)} \mathrm{\ is\ monotonically\ decreasing\ in\ } t. \end{aligned}$$
(6)

The GMRC can be viewed as a generalized version of the MRC in Sun and Cai (2009). It is easy to see that the GMRC reduces to the MRC when no covariate affects the statistic. The detailed derivations are deferred to the Supplementary Material.
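A brief sketch of this reduction, under the assumption that "no covariate effect" means the average conditional densities do not depend on the covariate state, i.e. \(g^{j0}(t)=g^{j1}(t)=g^{j}(t)\) for \(j=0,1\) (the symbol \(g^{j}\) is introduced here only for illustration): since \({\tilde{\pi }}_0+{\tilde{\pi }}_1=1\),

$$\begin{aligned} \frac{{\tilde{\pi }}_0g^{10}(t)+{\tilde{\pi }}_1g^{11}(t)}{{\tilde{\pi }}_0g^{00}(t)+{\tilde{\pi }}_1g^{01}(t)}=\frac{({\tilde{\pi }}_0+{\tilde{\pi }}_1)g^{1}(t)}{({\tilde{\pi }}_0+{\tilde{\pi }}_1)g^{0}(t)}=\frac{g^{1}(t)}{g^{0}(t)}, \end{aligned}$$

which is the ratio that the MRC requires to be monotonically decreasing in t.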

For clarity, denote by \({\mathcal {T}}\) the collection of statistics that satisfy the GMRC. The following theorem shows that the GMRC is a desirable condition for inference in the factorial HMM.

Theorem 1

Consider the factorial hidden Markov model (1)–(3). Let \(\delta \) be a decision rule of the form \(\delta ({\mathbf {T}},c)=(1_{(T_i({\mathbf {z}})<c)}:i=1,\dots ,m)\) with \(T_i({\mathbf {z}})\in {\mathcal {T}}\). Then

  (a) mFDR of \(\delta ({\mathbf {T}},c)\) is strictly increasing in the threshold c.

  (b) mFNR of \(\delta ({\mathbf {T}},c)\) is strictly decreasing in the threshold c.

  (c) In the weighted classification problem, the optimal cut-off c that minimizes the classification risk is strictly decreasing in \(\lambda \).

The form of the optimal classification statistic \(\varvec{\Lambda }\) in the weighted classification problem is given by the following theorem. Moreover, if \(\varvec{\Lambda } \in {\mathcal {T}}\), it is also the optimal testing statistic in the covariate-adjusted multiple testing problem.

Theorem 2

Consider the factorial hidden Markov model (1)–(3). Suppose that the true parameters \(\vartheta \) are known. Then the optimal classification rule which minimizes the expectation of the loss function in the weighted classification problem is \(\delta (\varvec{\Lambda },c)=(\delta _1,\dots ,\delta _m)\), where

$$\begin{aligned} \Lambda _i({\mathbf {z}})=\frac{P_{\vartheta }(\theta _i=0|{\mathbf {z}})}{P_{\vartheta }(\theta _i=1|{\mathbf {z}})} \end{aligned}$$

and \(\delta _i=1_{(\Lambda _i({\mathbf {z}})<1/\lambda )}\) for \(i=1,\dots ,m\). Moreover, if \(\varvec{\Lambda } \in {\mathcal {T}}\), then \(\varvec{\Lambda }\) is also the optimal statistic in the covariate-adjusted multiple testing problem in the sense that, for each mFDR level \(\alpha \), there is a unique \(c(\alpha )\) such that \(\delta (\varvec{\Lambda },c(\alpha ))\) controls mFDR at level \(\alpha \) with the smallest mFNR among all \(\alpha \)-level testing rules whose testing statistics satisfy the GMRC.

Since the proofs of Theorems 1 and 2 are analogous to those in Sun and Cai (2007), we omit them here. The next theorem shows that the optimal classification statistic \(\varvec{\Lambda }\) indeed belongs to the GMRC class \({\mathcal {T}}\).

Theorem 3

Consider the optimal classification statistic \(\varvec{\Lambda }\) in Theorem 2. Let \(G^{jk}_i(t)=P(\varvec{\Lambda }_i({\mathbf {z}})<t|\theta _i=j,\gamma _i=k),\) \(G^{jk}(t)=\frac{1}{m}\sum \limits ^m_{i=1} G^{jk}_i(t)\), and \(g^{jk}(t)=(d/dt)G^{jk}(t)\) for \(j=0,1,k=0,1\). Then we have

$$\begin{aligned} \frac{{\tilde{\pi }}_0g^{10}(t)+{\tilde{\pi }}_1g^{11}(t)}{{\tilde{\pi }}_0g^{00}(t)+{\tilde{\pi }}_1g^{01}(t)}=\frac{1}{t}{{\varvec{\cdot }}}\frac{\pi _0}{\pi _1}, \end{aligned}$$

namely, \(\varvec{\Lambda }\) belongs to the GMRC class.

The CALIS procedure under a factorial HMM

The previous section showed that the optimal classification statistic \(\varvec{\Lambda }\) is also optimal for the covariate-adjusted multiple testing problem. Note that \(\varvec{\Lambda }_i({\mathbf {z}})\) is increasing in \(P_{\vartheta }(\theta _i=0|{\mathbf {z}})\). Hence, an optimal covariate-adjusted multiple testing rule in a factorial HMM can be written in the form \(\delta _i=1_{\{P_{\vartheta }(\theta _i=0|{\mathbf {z}})<t\}},i=1,\dots ,m\). We define the covariate-adjusted local index of significance (CALIS) for hypothesis \(H_i\) as

$$\begin{aligned} CALIS_i=P_{\vartheta }(\theta _i=0|{\mathbf {z}}). \end{aligned}$$
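To make the link between the classification threshold in Theorem 2 and a threshold on CALIS explicit, write \(p_i=P_{\vartheta }(\theta _i=0|{\mathbf {z}})\) (a shorthand used only here), so that \(\Lambda _i({\mathbf {z}})=p_i/(1-p_i)\). A short calculation, offered as a sketch for \(p_i<1\), gives

$$\begin{aligned} \Lambda _i({\mathbf {z}})<\frac{1}{\lambda }\iff \lambda p_i<1-p_i\iff CALIS_i<\frac{1}{1+\lambda }, \end{aligned}$$

so thresholding \(\Lambda _i\) at \(1/\lambda \) is the same as thresholding \(CALIS_i\) at \(t=1/(1+\lambda )\).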

Although the definitions of CALIS and LIS may seem similar on the surface, they differ fundamentally in their implications and calculations. First, the different subscripts (\(\varphi \) and \(\vartheta \)) indicate that LIS and CALIS are based on different dependence models (HMM and factorial HMM). As illustrated in the previous section, the factorial HMM, which can accommodate covariate adjustment, is more reasonable and flexible. Second, CALIS can be decomposed into two parts, namely,

$$\begin{aligned} P_{\vartheta }(\theta _i=0|{\mathbf {z}})=P_{\vartheta }(\theta _i=0, \gamma _i=0|{\mathbf {z}})+P_{\vartheta }(\theta _i=0, \gamma _i=1|{\mathbf {z}}), \end{aligned}$$

where each term on the right-hand side can be calculated efficiently by using a modified forward–backward algorithm. This implies that we indeed adjust for the effects of covariates when calculating CALIS, which is exactly the goal of this paper for large-scale multiple testing under dependence.

Given the optimal covariate-adjusted multiple testing statistic, CALIS, we next need to derive a suitable cut-off for it. Since the derivation of the cut-off is the same as in Sun and Cai (2009), Genovese and Wasserman (2004) and Newton et al. (2004), we omit the details.

Given the parameters \(\vartheta \) of the factorial HMM, denote by \(CALIS_{(1)},CALIS_{(2)},\dots ,CALIS_{(m)}\) the ordered CALIS values and \(H_{(1)},H_{(2)},\dots ,H_{(m)}\) the corresponding null hypotheses. The oracle CALIS procedure (\(\vartheta \) known) operates as follows:

$$\begin{aligned} \mathrm{Let}\ k=\max \left\{ i:\frac{1}{i}\sum ^i_{j=1}CALIS_{(j)}\le \alpha \right\} ,\ \mathrm{then\ reject\ }H_{(i)},i=1,\dots ,k. \end{aligned}$$
(7)

The next theorem shows that the oracle CALIS procedure is valid, namely, it controls FDR at the pre-specified level.

Theorem 4

Consider the factorial hidden Markov model (1)–(3). Then the oracle testing procedure (7) controls FDR at \(\alpha \).

In reality, the parameters \(\vartheta \) of the factorial HMM are typically unknown. We use the plug-in \({\widehat{CALIS}}_i, i = 1,\dots ,m\) in the data-driven procedure by replacing \(\vartheta \) with its MLE \({\hat{\vartheta }}\). Similarly, the data-driven CALIS procedure with unknown parameters \(\vartheta \) operates as follows:

  • Calculate the plug-in \({\widehat{CALIS}}_i=P_{{\hat{\vartheta }}}(\theta _i=0|{\mathbf {z}})\) values, where \({\hat{\vartheta }}\) can be obtained by using the EM algorithm.

  • Rank the plug-in \({\widehat{CALIS}}\) values. Denote by \({\widehat{CALIS}}_{(1)},\dots ,{\widehat{CALIS}}_{(m)}\) the ordered \({\widehat{CALIS}}\) values and \(H_{(1)},\dots ,H_{(m)}\) the corresponding null hypotheses.

  • Let \(l=\max \left\{ i:\frac{1}{i}\sum ^i_{j=1}{\widehat{CALIS}}_{(j)}\le \alpha \right\} \). Then reject all \(H_{(i)}\), for \(i=1,\dots ,l\).

Next, we will show that the oracle CALIS procedure and the data-driven CALIS procedure are asymptotically equivalent under some standard assumptions on the factorial HMM. A detailed illustration of these assumptions can be found in the literature (Bickel et al. 1998; Leroux 1992).

Assumption 1

The hidden states \(\{\theta _i\}_{i=1}^m\) and \(\{\gamma _i\}_{i=1}^m\) are mutually independent, and each follows an irreducible, aperiodic and stationary Markov chain. They are characterized by \(\vartheta _0=({\mathcal {A}}_0, {\mathcal {B}}_0, {\pi }_0, \tilde{{{\pi }}}_0, {\mathcal {F}}_0)\). \(\vartheta _0\) is an interior point of the parameter space \(\varTheta \).

Assumption 2

Denote by \({\mathcal {A}}_{\vartheta }=(a_{ij}(\vartheta ))\) and \({\mathcal {B}}_{\vartheta }=(b_{ij}(\vartheta ))\) the transition matrices and \(\pi _\vartheta =(\pi _0(\vartheta ), \pi _1(\vartheta ))\) and \({\tilde{\pi }}_{\vartheta }=({\tilde{\pi }}_0(\vartheta ),{\tilde{\pi }}_1(\vartheta ))\) the stationary distributions of the hidden states \(\{\theta _i\}_{i=1}^m\) and \(\{\gamma _i\}_{i=1}^m\), respectively. There are \(\gamma >0\) and \(\epsilon _0>0\) such that, for all \(|\vartheta -\vartheta _0|<\gamma \) and all \(i,j=0,1,\) \(a_{ij}(\vartheta )\ge \epsilon _0>0\), \(b_{ij}(\vartheta )\ge \epsilon _0>0\), \(\pi _i(\vartheta )\ge \epsilon _0>0\) and \({\tilde{\pi }}_i(\vartheta )\ge \epsilon _0>0\).

Assumption 3

There is a \(\gamma >0\) such that \(P(\rho _0(Z_1)=\infty |\theta _1=j)<1\) for all j, where

$$\begin{aligned} \rho _0(z)=\sup \limits _{|\vartheta -\vartheta _0|<\gamma }\max \limits _{0\le j,t\le 1}\left\{ \frac{\sum ^1_{k=0}P_{\vartheta }(Z_1=z|\theta _{1}=j,\gamma _{1}=k){\tilde{\pi }}_k}{\sum ^1_{k=0}P_{\vartheta }(Z_1=z|\theta _{1}=t,\gamma _{1}=k){\tilde{\pi }}_k} \right\} . \end{aligned}$$

Assumption 4

\({\widehat{\vartheta }}\) is a consistent estimate of \(\vartheta _0\).

Assumption 5

\(\sum \nolimits ^1_{k=0}P_{\vartheta }(Z_1=z|\theta _{1}=j,\gamma _{1}=k){\tilde{\pi }}_k\) for \(j=0,1\) are continuous and positive over the real line, and

$$\begin{aligned} \inf \limits _z\left\{ \sum \limits ^1_{k=0}P_{\vartheta }(z|\theta _{1}=0,\gamma _{1}=k){\tilde{\pi }}_k\bigg /\sum \limits ^1_{k=0}P_{\vartheta }(z|\theta _{1}=1,\gamma _{1}=k){\tilde{\pi }}_k\right\} =0, \end{aligned}$$

for all \(|\vartheta -\vartheta _0|<\gamma \).

Theorem 5

Consider the factorial HMM (1)–(3). Let \(FDR_{CALIS}^{OR}\), \(FDR_{CALIS}\), and \(FNR_{CALIS}^{OR}\), \(FNR_{CALIS}\) be the FDR levels and FNR levels that are yielded by the oracle CALIS and data-driven CALIS procedures, respectively. If Assumptions 1–5 hold, then \(FDR_{CALIS}^{OR}-FDR_{CALIS}\rightarrow 0\), as \(m\rightarrow \infty \). In addition, if at least a fixed proportion of hypotheses are not rejected, then \(FNR_{CALIS}^{OR}-FNR_{CALIS}\rightarrow 0\), as \(m\rightarrow \infty \).

The forward–backward algorithm for computing CALIS

According to the definition of the CALIS, it can be expressed as:

$$\begin{aligned} CALIS_k=\frac{\sum _{q=0}^1\alpha _k(0,q)\beta _k(0,q)}{\sum ^1_{p=0}\sum ^1_{q=0}\alpha _k(p,q)\beta _k(p,q)}, \end{aligned}$$

where the forward variable \(\alpha _k(p,q)=P_{\vartheta }(\{z_i\}^k_{i=1},\theta _k=p,\gamma _k=q)\) and backward variable \(\beta _k(p,q)=P_{\vartheta }(\{z_i\}^m_{i=k+1}|\theta _k=p,\gamma _k=q)\). With a few minor modifications on the forward–backward algorithm (Baum et al. 1970), we can obtain

$$\begin{aligned} \alpha _1(p,q)=\pi _p{\tilde{\pi }}_qf_{p,q}(z_1);~~\alpha _{k+1}(u,v)=\sum _{p=0}^1\sum _{q=0}^1\alpha _{k}(p,q)f_{u,v}(z_{k+1})a_{pu}b_{qv}, \end{aligned}$$

and

$$\begin{aligned} \beta _m(p,q)=1;~~\beta _k(p,q)=\sum _{u=0}^1\sum _{v=0}^1\beta _{k+1}(u,v)f_{u,v}(z_{k+1})a_{pu}b_{qv}, \end{aligned}$$

where \(f_{p,q}\) is the probability density with respect to \(F_{p,q}\), for \(p,q=0,1\).
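The recursions above translate directly into code. The following is a minimal sketch rather than the authors' implementation: the function names, the dictionary-based layout and the callable `f(p, q, x)` for the density \(f_{p,q}\) are assumptions made for illustration. It also rescales the forward and backward variables at every position to avoid numerical underflow for large m; this rescaling is an addition not described in the text, and it leaves CALIS unchanged because the scale factors cancel in the ratio at each k.

```python
import math
from itertools import product

STATES = list(product((0, 1), repeat=2))  # all (theta, gamma) pairs


def normal_pdf(x, mu, sd=1.0):
    """Density of N(mu, sd^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def calis(z, A, B, pi, pi_tilde, f):
    """Return [P(theta_k = 0 | z) for each k] by a rescaled forward-backward pass.

    A[p][u], B[q][v]: transition probabilities a_{pu}, b_{qv};
    pi, pi_tilde: initial (stationary) distributions;
    f(p, q, x): density f_{p,q}(x) (hypothetical interface).
    """
    m = len(z)

    def normalise(d):
        total = sum(d.values())
        return {s: v / total for s, v in d.items()}

    # Forward: alpha_1(p,q) = pi_p * pi~_q * f_{p,q}(z_1), then the recursion.
    alpha = [normalise({(p, q): pi[p] * pi_tilde[q] * f(p, q, z[0]) for p, q in STATES})]
    for k in range(1, m):
        prev = alpha[-1]
        alpha.append(normalise({
            (u, v): f(u, v, z[k]) * sum(prev[p, q] * A[p][u] * B[q][v] for p, q in STATES)
            for u, v in STATES
        }))

    # Backward: beta_m(p,q) = 1, then the recursion run from the end to the start.
    beta = [None] * m
    beta[m - 1] = {s: 1.0 for s in STATES}
    for k in range(m - 2, -1, -1):
        nxt = beta[k + 1]
        beta[k] = normalise({
            (p, q): sum(nxt[u, v] * f(u, v, z[k + 1]) * A[p][u] * B[q][v] for u, v in STATES)
            for p, q in STATES
        })

    # CALIS_k: posterior mass on theta_k = 0, summed over gamma_k.
    out = []
    for k in range(m):
        w = {s: alpha[k][s] * beta[k][s] for s in STATES}
        out.append((w[0, 0] + w[0, 1]) / sum(w.values()))
    return out
```

With unit-variance normal components one could pass, for example, `f = lambda p, q, x: normal_pdf(x, means[p, q])` for a user-chosen dictionary `means`; mixtures for \(F_{1,1}\) only require a different callable.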

The EM algorithm for calculating parameters of the factorial HMM

In this section, Table 1 gives a detailed EM algorithm for calculating the parameters of the factorial HMM when the number of components is \(L = 1\). It is easy to extend the algorithm to the case of \(L \ge 2\), at the cost of more complex notation.

Table 1 The EM algorithm for calculating parameters of the factorial HMM

Simulation studies

In this section, we conduct a series of simulation studies to evaluate the numerical performance of our CALIS procedures, including the oracle CALIS procedure and the data-driven CALIS procedure. The simulations are divided into two scenarios in terms of mechanisms for generating observed values. In Scenario 1, the observed values are generated from the factorial HMM described in Sect. 2.2. In order to simulate a more realistic dependence structure among SNPs, we simulate genotypes based on the HapMap3 dataset, generate case-control subjects via a logistic regression model, and calculate the observed value of each genetic locus through existing test methods in Scenario 2. It is necessary to point out that the number of components L is known in settings of Scenario 1, whereas L is completely unknown for Scenario 2 and we use BIC for selection.

Scenario 1: the factorial HMM dependence structure

In this scenario, the simulation results are based on 100 replications and the number of hypotheses of primary interest is set to 3000. Consider the factorial HMM described in Sect. 2.2. The states of primary hypotheses \(\{\theta _i\}^{3000}_{i=1}\) are generated with the transition matrix \({\mathcal {A}}=(0.95,0.05;0.1,0.9)\) and the initial distribution is set to be (0.95, 0.05). Similarly, the existence statuses of covariate effects \(\{\gamma _i\}^{3000}_{i=1}\) are generated with the transition matrix \({\mathcal {B}}=(0.9,0.1;0.05,0.95)\) and the initial distribution is set to be (0.8, 0.2). The observations \(\{z_i\}_{i=1}^{3000}\) are generated from the four-component mixture model (4) described in Sect. 2.2, where \(F_{0,0}\sim N(0, 1)\), \(F_{0,1}\sim N(-1, 1)\), \(F_{1,0}\sim N(1, 1)\), and \(F_{1,1}\) is a normal mixture. Here we assume that the number of components L in \(F_{1,1}\) is known, and further divide this simulation into two cases with different numbers of components in the normal mixture (\(L=1\) and 2). In essence, the case with \(L=1\) means that all alternatives (\(F_{0,1}, F_{1,0}\) and \(F_{1,1}\)) are normal distributions.

Case 1 (\(L=1\)): \(F_{1,1}\sim N(\mu _1, 1)\)

In this case, we vary \(\mu _1\) from 1 to 5 with an increment 0.5 and exhibit the simulation results in Fig. 2.
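One replication of this design could be generated along the following lines. This is an illustrative sketch, not the authors' simulation code: the function names, the default \(\mu _1=2\) and the seed handling are assumptions, while the transition matrices, initial distributions and component means are those stated for Scenario 1, Case 1.

```python
import random


def simulate_chain(m, trans, init, rng):
    """Draw a two-state Markov chain of length m from `init` and `trans`."""
    state = 0 if rng.random() < init[0] else 1
    path = [state]
    for _ in range(m - 1):
        # Move to state 0 with probability trans[state][0], else to state 1.
        state = 0 if rng.random() < trans[state][0] else 1
        path.append(state)
    return path


def simulate_scenario1_case1(m=3000, mu1=2.0, seed=0):
    """Return (theta, gamma, z) for one replication with L = 1."""
    rng = random.Random(seed)
    A = [[0.95, 0.05], [0.10, 0.90]]   # transitions of theta
    B = [[0.90, 0.10], [0.05, 0.95]]   # transitions of gamma
    theta = simulate_chain(m, A, (0.95, 0.05), rng)
    gamma = simulate_chain(m, B, (0.80, 0.20), rng)
    # Means of F_{0,0}, F_{0,1}, F_{1,0}, F_{1,1}; all variances equal 1.
    means = {(0, 0): 0.0, (0, 1): -1.0, (1, 0): 1.0, (1, 1): mu1}
    z = [rng.gauss(means[t, g], 1.0) for t, g in zip(theta, gamma)]
    return theta, gamma, z
```

Repeating the call with different seeds and values of `mu1` on the grid 1, 1.5, ..., 5 would reproduce the structure of the experiment, though not necessarily the exact random draws used for the reported figures.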

Fig. 2 Simulation results in Case 1 of Scenario 1. (a) The FDR levels of all four procedures are controlled at 0.1 consistently, and the LIS procedures are always conservative. (b) The FNR values of the CALIS procedures are smaller than those of the LIS procedures at the same \(\mu _1\) value. The FDR and FNR yielded by the oracle CALIS procedure and the data-driven CALIS procedure are almost the same, and a similar conclusion holds for the LIS procedures

In Fig. 2, we can see from panel (a) that all four procedures control the FDR at the pre-specified level 0.1 consistently. However, the LIS procedures (LIS.or and LIS.dd; the oracle and data-driven LIS procedures) are always conservative, with a low FDR level around 0.02. The results in panel (b) illustrate that: (1) the FNR values yielded by the CALIS procedures (CALIS.or and CALIS.dd; the oracle and data-driven CALIS procedures) are almost the same; (2) the same holds true for the LIS procedures; (3) the FNR values of the LIS procedures are much higher than those of the CALIS procedures; (4) the FNR value of the CALIS procedures decreases as \(\mu _1\) varies from 1 to 5. The foregoing results indicate that our CALIS procedures are valid and substantially outperform their competitors by properly exploiting the information in the covariate effects.

In addition, we explain why the LIS procedures are conservative under the preceding model setting by focusing on the special case in which the tests are independent (Efron et al. 2001). For the sake of coherence, the detailed explanation is given in the Supplementary Material.

Case 2 (\(L=2\)): \(F_{1,1}\sim 0.5N(3,1)+0.5N(\mu _2,1)\)

In this case, we vary \(\mu _2\) from 1 to 3 in increments of 0.25 and show the simulation results in Fig. 3.

In Fig. 3, we can observe from panel (a) that all four procedures approximately control the FDR at the nominal level 0.1. Although the data-driven CALIS procedure has the largest FDR, it is still acceptable (FDR = 0.107). The LIS procedures are conservative and lead to a small FDR value around 0.05. We can also see from panel (b) that the FNR curves yielded by the two CALIS procedures nearly overlap and are uniformly lower than those of the LIS procedures. When \(\mu _2\) is relatively large, the data-driven LIS procedure outperforms the oracle LIS procedure. This may be due to the higher FDR level of the data-driven LIS procedure. By and large, the numerical results agree with those in Case 1.

Fig. 3 Simulation results in Case 2 of Scenario 1

It is worth noting that the higher power of our procedures is not gained at the cost of a higher FDR level. To verify this point, we further evaluate the sensitivities yielded by these procedures at different FDR levels for fixed \(\mu _1\) under the setting of Case 1, where the sensitivities are calculated as the average proportions of correctly identified SNPs over 100 replications. The results are shown in Fig. 4. We can observe that: (1) the sensitivity curves of the CALIS procedures almost overlap; (2) the same holds true for the LIS procedures except when \(\mu _1=1\); (3) the sensitivity values of our procedures are consistently higher than those of the LIS procedures. These results imply that our CALIS procedures achieve higher multiple testing efficiency than the LIS procedures at the same FDR level.

Fig. 4 A further simulation study in Scenario 1. The sensitivities are calculated as the average proportions of correctly identified SNPs over 100 replications. The sensitivity curves of the CALIS procedures almost overlap, and the same holds true for the LIS procedures except when \(\mu _1= 1\). The sensitivity values of our procedures are consistently higher than those of the LIS procedures

Scenario 2: the more realistic SNP-dependence structure

To further compare the procedures under more realistic LD patterns among SNPs, we generate a genotype pool by randomly matching the 340 haplotypes from the JPT (Japanese in Tokyo, Japan) and CHB (Han Chinese in Beijing, China) subjects collected by HapMap3. We compare our CALIS procedure, which uses the covariate-adjusted observed values, with the LIS procedures that use the covariate-unadjusted and covariate-adjusted observed values (the LIS procedure and the LIS.cov procedure, respectively). To be fair, we also include the CALIS procedure using the covariate-unadjusted observed values (the CALIS.uncov procedure) in the comparisons. Here the covariate-adjusted and covariate-unadjusted observed values are calculated by using the covariate-adjusted association test (Jiang and Zhang 2011) and the covariate-unadjusted association test (Zhang et al. 2010), respectively. To focus on the main points, we restrict attention to a region of the first chromosome that consists of 1000 SNPs. Four SNPs are selected as the disease-associated SNPs (with relative risk 1.5), among which two SNPs are far apart and the other two SNPs are close (separated by 3 SNPs). In addition, we consider two environmental factors, namely, a continuous covariate \(E_{co}\) generated from N(0, 1) and a categorical covariate \(E_{ca}\) generated from the binomial distribution B(1, 0.5). The phenotype Y is generated according to a logistic regression model:

$$\begin{aligned} P(Y=1|G,E_{co},E_{ca})=\frac{\exp (\beta _0+\beta ^{T}G+\gamma _1E_{ca}+\gamma _2E_{co})}{1+\exp (\beta _0+\beta ^{T}G+\gamma _1E_{ca}+\gamma _2E_{co})}, \end{aligned}$$

where \(\beta ={(\beta _1,\beta _2,\beta _3,\beta _4)}^{T}, G={(G_1,G_2,G_3,G_4)}^{T}\) and \(G_i\) is the genotype of the i-th causal SNP. We set \(\beta _1=\beta _2=\beta _3=\beta _4= \log (1.5)\) and consider the following settings with different \(\gamma _1\) and \(\gamma _2\).

  (a) Setting 1: \(\gamma _1=3.5,\gamma _2=3.5\).
  (b) Setting 2: \(\gamma _1=4,\gamma _2=4\).
  (c) Setting 3: \(\gamma _1=4.5,\gamma _2=4.5\).

Correspondingly, \(\beta _0\) is set to \(-11.45\), \(-12.75\), and \(-14\) so that the prevalence of the disease is controlled at 0.02. For each setting, we repeatedly generate the disease status for each individual until we obtain 1000 cases and 1000 controls. The four disease-associated SNPs are removed from our simulated data set. Then, the twenty-one SNPs comprising the 3 adjacent SNPs on each side of the 4 disease-causal SNPs are defined as relevant SNPs (the two close causal SNPs share three neighbours, so the count is \(4\times 6-3=21\)). We evaluate the performance of a testing procedure by its selection rate of relevant SNPs. As mentioned earlier, the number of components L is unknown in this scenario and is selected by BIC. The simulation is repeated 100 times, and the results for the above model settings are displayed in Figs. 5, 6 and 7, respectively.
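The case-control sampling step described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the causal genotypes are drawn independently from Binomial(2, `maf`) as a stand-in for the HapMap3-based genotype pool, and the function name, `maf`, `batch`, and `seed` are hypothetical.

```python
import numpy as np

def sample_case_control(beta0, gamma1, gamma2, n_cases=1000, n_controls=1000,
                        maf=0.3, batch=20000, seed=0):
    """Draw individuals from the logistic model until enough cases and
    controls are collected. Returns two arrays with columns
    (G1, G2, G3, G4, E_ca, E_co)."""
    rng = np.random.default_rng(seed)
    beta = np.full(4, np.log(1.5))  # beta_1 = ... = beta_4 = log(1.5)
    cases, controls = [], []
    kept_cases = kept_controls = 0
    while kept_cases < n_cases or kept_controls < n_controls:
        # ASSUMPTION: independent Binomial(2, maf) genotypes replace the
        # haplotype-matching scheme used in the paper.
        G = rng.binomial(2, maf, size=(batch, 4))
        e_co = rng.normal(size=batch)            # continuous covariate, N(0, 1)
        e_ca = rng.binomial(1, 0.5, size=batch)  # categorical covariate, B(1, 0.5)
        eta = beta0 + G @ beta + gamma1 * e_ca + gamma2 * e_co
        prob = 1.0 / (1.0 + np.exp(-eta))        # P(Y = 1 | G, E_co, E_ca)
        y = rng.random(batch) < prob
        X = np.column_stack([G, e_ca, e_co])
        # Keep only as many new individuals as are still needed.
        new_cases = X[y][: n_cases - kept_cases]
        new_controls = X[~y][: n_controls - kept_controls]
        cases.append(new_cases)
        controls.append(new_controls)
        kept_cases += len(new_cases)
        kept_controls += len(new_controls)
    return np.vstack(cases), np.vstack(controls)
```

For example, `sample_case_control(-11.45, 3.5, 3.5)` mimics Setting 1. Because the genotype distribution here differs from the paper's, the resulting prevalence need not equal 0.02.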

Fig. 5 The simulation results of Setting 1 in Scenario 2. (a) The sensitivity (calculated as the percentage of true positives that are among the top K SNPs) increases as K varies from 0 to 300. (b) The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at different threshold settings over 100 replications. The sensitivity and ROC curves yielded by the CALIS procedure clearly dominate those of its competitors

Fig. 6 The simulation results of Setting 2 in Scenario 2. The results almost coincide with those of Setting 1 in Scenario 2

Fig. 7 The simulation results of Setting 3 in Scenario 2. The results almost coincide with those of Setting 1 in Scenario 2

From Figs. 5a, 6a and 7a, we can see that the sensitivity (defined as the percentage of true positives that are among the top K SNPs) increases as K varies from 0 to 300. The sensitivity yielded by the CALIS procedure is uniformly larger than those of the LIS procedures and the CALIS.uncov procedure. This indicates that our CALIS procedure achieves higher ranking efficiency and can discover more true positives at the same number of rejections. It is interesting to note that the difference in sensitivity between the CALIS procedure and the LIS procedures increases with the values of \(\gamma _1\) and \(\gamma _2\). This illustrates that covariate adjustment is especially helpful when the covariate effect is large. We can also observe that the sensitivity yielded by the CALIS procedure dominates that of the CALIS.uncov procedure. Between the two LIS procedures (LIS.cov and LIS), the LIS.cov procedure is preferable. This reveals that taking the covariate effect into account is helpful in multiple testing.
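The top-K sensitivity used in panel (a) can be illustrated with a short Python sketch. It assumes that each SNP carries a score for which smaller values indicate stronger evidence, and that the relevant SNPs are the flanking neighbours of the causal SNPs; the helper names and the causal positions in the usage note are hypothetical.

```python
import numpy as np

def relevant_set(causal, flank=3):
    """SNP positions within `flank` of any causal SNP, excluding the causal ones."""
    causal = set(causal)
    relevant = set()
    for c in causal:
        relevant.update(range(c - flank, c + flank + 1))
    return relevant - causal

def topk_sensitivity(snp_ids, scores, relevant, k):
    """Proportion of relevant SNPs that appear among the k smallest scores."""
    order = np.argsort(scores, kind="stable")  # smaller score = higher rank
    top_ids = np.asarray(snp_ids)[order[:k]]
    hits = sum(int(i) in relevant for i in top_ids)
    return hits / len(relevant)
```

With causal SNPs at the illustrative positions 100, 104, 400 and 800 (the first two separated by 3 SNPs), `relevant_set([100, 104, 400, 800])` contains 21 positions, which matches the count of relevant SNPs stated above. Averaging `topk_sensitivity` over replications for each K would give a curve of the kind shown in panel (a).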

From Figs. 5b, 6b and 7b, we can see an alternative measure of ranking efficiency, namely, the ROC curve. Here, the ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at different threshold settings over 100 replications. The ROC curves yielded by the CALIS procedure clearly dominate those of its competitors, and the results are consistent with those in Scenario 1.
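As a rough illustration of how such a curve can be traced for a single replication, the sketch below sweeps a threshold over the ranked scores and records the (FPR, TPR) pairs. It again assumes that smaller scores indicate stronger evidence; the averaging over 100 replications is omitted, and the function names are hypothetical.

```python
import numpy as np

def roc_points(scores, is_relevant):
    """Return (fpr, tpr) arrays obtained by rejecting the k smallest scores
    for k = 0, 1, ..., m."""
    scores = np.asarray(scores, dtype=float)
    is_relevant = np.asarray(is_relevant, dtype=bool)
    order = np.argsort(scores, kind="stable")
    hits = is_relevant[order]
    tpr = np.concatenate([[0.0], np.cumsum(hits) / hits.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~hits) / (~hits).sum()])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

One curve dominating another, as described above, means that its TPR is at least as large at every FPR, which also implies a larger area under the curve.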

To evaluate the numerical performance of the CALIS procedure when there is no covariate effect, we conduct additional simulation studies in the setting where \(\gamma _1=\gamma _2=0\). We set \(\beta _1=\beta _2=\beta _3=\beta _4= \log (1.5)\) and \(\beta _0=-4.8\) so that the prevalence of the disease is controlled at 0.02. Due to the absence of covariate effects, we choose the LIS procedure using the covariate-unadjusted association test (Zhang et al. 2010) as a benchmark. The simulation is repeated 100 times, and the results are displayed in Fig. 8. We can observe that the CALIS procedure performs well when K or the FPR is small, although it is somewhat conservative overall for \(\gamma _1=\gamma _2=0\). Interestingly, the sensitivity and ROC curves of the CALIS.uncov procedure are not much different from those of the LIS procedure. This suggests that the CALIS.uncov procedure can serve as an alternative when there is no covariate effect.

Fig. 8 The simulation results of the setting \(\gamma _1=\gamma _2=0\)

Application to bipolar disorder datasets

Bipolar disorder (BD) is a manic depressive illness that causes periods of depression and periods of elevated mood. There is considerable evidence for substantial genetic and environmental contributions to the risk of BD (Merikangas et al. 1998). However, the pathogenic mechanism of BD is not clearly understood. To identify SNPs associated with BD while adjusting for covariates, we apply our new procedure to an analysis of BD datasets. The datasets were collected by the Wellcome Trust Case Control Consortium (WTCCC) and contain 1998 cases and 3004 controls, among which there are 1504 control samples from the 1958 Birth Cohort (58C) and 1500 control samples from the UK Blood Service (UKBS). The study subjects were genotyped by using GeneChip 500k arrays at the Affymetrix Services Lab. In addition, we consider gender and age at recruitment as covariates.

A series of quality control (QC) procedures are performed before the real data analysis. We exclude 130 samples from the BD cohort, 24 samples from the 58C cohort and 42 samples from the UKBS cohort owing to high missing rates, overall heterozygosity and non-European ancestry. In addition, we eliminate the SNPs in accordance with the exclusion list provided by WTCCC. Furthermore, the SNPs with minor allele frequency less than 5% are also excluded.

Fifteen SNPs have been reported to be associated with BD: thirteen of them show moderate evidence of association with BD (Burton et al. 2007), and the other two (rs7680321 and rs11089599; Krystal et al. 2002; Vawter et al. 2002) are related to GABA neurotransmission in GABRB1 and synaptic function in SYN3, respectively. Note that all of the suspected SNPs are located on ten different chromosomes. To illustrate the main point, we apply our procedure only to these ten chromosomes, separately, and make a comparison with the classical LIS procedure (the LIS.cov procedure). As before, we use the BIC to select the number of components L. The observed values are calculated by using the covariate-adjusted association test proposed by Jiang and Zhang (2011). It has been shown that ranking LIS.cov values across all chromosomes can achieve higher testing efficiency (Wei et al. 2009). Hence, we first calculate chromosome-specific CALIS values and LIS.cov values, and then rank the CALIS values and LIS.cov values across all ten chromosomes. The detailed results are shown in Table 2. The FDR level is set to \(1\times 10^{-7}\) for both the CALIS procedure and the LIS.cov procedure. There are 249 SNPs identified by the CALIS procedure and 202 SNPs identified by the LIS.cov procedure from a total of 182,072 SNPs. Furthermore, among the 15 suspected SNPs detected by Burton et al. (2007), six are identified by the CALIS procedure while only one (rs1344484) is identified by the LIS.cov procedure. Note that a smaller CALIS or LIS.cov value indicates that the SNP is more likely to be associated with the disease. Specifically, in Table 2, we can observe that the CALIS value is uniformly smaller than the LIS.cov value for each identified SNP. This implies that, by taking covariate adjustment into account, the CALIS procedure provides stronger evidence of association between the suspected SNPs and BD. We can also see that the CALIS values uniformly have smaller ranks than the LIS.cov values. This illustrates that the CALIS procedure ranks the SNPs associated with BD more efficiently.
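The pooling step described above (chromosome-specific values ranked jointly across chromosomes) can be sketched as follows. The sketch assumes the step-up rule of Sun and Cai (2009), as pooled in Wei et al. (2009): sort the pooled values in increasing order and reject the largest set whose running average does not exceed the nominal level. Whether the CALIS procedure uses exactly this rule should be checked against its definition earlier in the paper; the function name and the dictionary input format are hypothetical.

```python
import numpy as np

def pooled_step_up(values_by_chrom, alpha):
    """Pool chromosome-specific values and apply a step-up rule.

    values_by_chrom : dict mapping a chromosome label to an array of
                      LIS-type values (smaller = stronger evidence)
    alpha           : nominal FDR level
    Returns the rejected (chromosome, index) pairs, ordered by rank.
    """
    chroms, idx, vals = [], [], []
    for chrom, v in values_by_chrom.items():
        v = np.asarray(v, dtype=float)
        chroms.extend([chrom] * len(v))
        idx.extend(range(len(v)))
        vals.append(v)
    vals = np.concatenate(vals)
    order = np.argsort(vals, kind="stable")
    # The running mean of the sorted values is non-decreasing, so the largest
    # k with mean <= alpha is the last index that passes the check.
    running_mean = np.cumsum(vals[order]) / np.arange(1, len(vals) + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    k = int(passing[-1]) + 1 if passing.size else 0
    return [(chroms[j], idx[j]) for j in order[:k]]
```

In this sketch the rank of a SNP is its position in the pooled ordering, so a SNP with a smaller value on one chromosome can outrank SNPs on every other chromosome.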

Table 2 Results of LIS.cov and CALIS procedures at FDR-level = \(1\times 10^{-7}\)

We further apply the LIS procedure and the CALIS procedure with the covariates (gender and age) excluded in this real data analysis; these are denoted by the LIS procedure and the CALIS.uncov procedure, respectively. The observed values in this case are calculated by employing the covariate-unadjusted association test (Zhang et al. 2010). There are 66 SNPs identified by the CALIS.uncov procedure and 64 SNPs identified by the LIS procedure. Furthermore, among those 15 suspected SNPs, one is identified by the CALIS.uncov procedure, while none is identified by the LIS procedure. The detailed results are listed in the Supplementary Material. Based on the results of Table 2 and Table S1 in the Supplementary Material, we can conclude that accommodating covariate adjustment in multiple testing can improve the chance of identifying disease-related SNPs.

Discussion

In this paper, we propose a covariate-adjusted multiple testing procedure based on a factorial HMM. The new procedure can adjust for covariate effects when detecting the primary effects of interest associated with the outcome. The theoretical results show that our procedure is valid and optimal when covariate effects are taken into account. Simulations and a real data analysis show that the efficiency of multiple testing can be substantially improved by employing our new procedure.

Our procedure can be extended in several ways. First, the assumption that the transition probabilities are invariant may be too strong. Kuan and Chiang (2012) developed a multiple testing procedure based on a nonhomogeneous HMM that allows exogenous information to be incorporated systematically. This approach may offer a promising way to address this issue. However, a new problem arises when a nonhomogeneous HMM is used to characterize the dependence structure among the tests. To the best of our knowledge, the consistency of the estimates of a nonhomogeneous HMM has not been investigated. Second, in practice, we would like to discover the SNPs together with the environmental factors and the SNP-environment interactions that are truly associated with the disease. It would be meaningful to develop a multiple testing procedure that can test these effects simultaneously while allowing for dependence among the tests.

Another interesting question is how to relax or remove GMRC for our CALIS procedure, which would allow it to be used more widely in practice. Although GMRC is a straightforward extension of the MRC proposed by Sun and Cai (2009), preserving the equivalence between the optimal statistics in the weighted classification problem and the covariate-adjusted multiple testing problem without GMRC is very challenging for multiple testing under dependence. Recently, Cai et al. (2019) developed the covariate-assisted ranking and screening (CARS) procedure for large-scale two-sample inference. They showed that the CARS procedure controls the mFDR with the largest expected number of true positives (ETP) without MRC, which seems to open up a possible approach to the above problem. However, as Cai et al. (2019) pointed out, the theoretical results of CARS cannot be applied to dependent tests directly. We plan to pursue this problem in our future research.

References

  1. Baum LE, Petrie T, Soules G, Weiss N (1970) A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann Math Stat 41(1):164–171
  2. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodol) 57:289–300
  3. Benjamini Y, Hochberg Y (2000) On the adaptive control of the false discovery rate in multiple testing with independent statistics. J Educ Behav Stat 25(1):60–83
  4. Bickel PJ, Ritov YA, Ryden T (1998) Asymptotic normality of the maximum-likelihood estimator for general hidden Markov models. Ann Stat 26(4):1614–1635
  5. Burton PR, Clayton DG, Cardon LR, Craddock N, Deloukas P, Duncanson A, Kwiatkowski DP, McCarthy MI, Ouwehand WH, Samani NJ (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3000 shared controls. Nature 447(7145):661–678
  6. Cai TT, Sun W, Wang W (2019) Covariate-assisted ranking and screening for large-scale two-sample inference. J R Stat Soc Ser B (Methodol) 81(2):187–234
  7. Efron B, Tibshirani R, Storey JD, Tusher V (2001) Empirical Bayes analysis of a microarray experiment. J Am Stat Assoc 96(456):1151–1160
  8. Genovese C, Wasserman L (2002) Operating characteristics and extensions of the false discovery rate procedure. J R Stat Soc Ser B (Methodol) 64(3):499–517
  9. Genovese C, Wasserman L (2004) A stochastic process approach to false discovery control. Ann Stat 32(3):1035–1061
  10. Ghahramani Z, Jordan MI (1997) Factorial hidden Markov models. Mach Learn 29(2–3):245–273
  11. Jiang Y, Zhang H (2011) Propensity score-based nonparametric test revealing genetic variants underlying bipolar disorder. Genet Epidemiol 35(2):125–132
  12. Krystal JH, Sanacora G, Blumberg H, Anand A, Charney DS, Marek G, Epperson CN, Goddard A, Mason GF (2002) Glutamate and GABA systems as targets for novel antidepressant and mood-stabilizing treatments. Mol Psychiatry 7(1):S71
  13. Kuan PF, Chiang DY (2012) Integrating prior knowledge in multiple testing under dependence with applications to detecting differential DNA methylation. Biometrics 68(3):774–783
  14. Lei L, Fithian W (2018) AdaPT: an interactive procedure for multiple testing with side information. J R Stat Soc Ser B (Methodol) 80(4):649–679
  15. Leroux BG (1992) Maximum-likelihood estimation for hidden Markov models. Stoch Process Their Appl 40(1):127–143
  16. Liang K, Nettleton D (2010) A hidden Markov model approach to testing multiple hypotheses on a tree-transformed gene ontology graph. J Am Stat Assoc 105(492):1444–1454
  17. Liang K, Du C, You H, Nettleton D (2018) A hidden Markov tree model for testing multiple hypotheses corresponding to gene ontology gene sets. BMC Bioinf 19(1):107
  18. Liu J, Zhang C, Page D (2016) Multiple testing under dependence via graphical models. Ann Appl Stat 10(3):1699–1724
  19. Merikangas KR, Mehta RL, Molnar BE, Walters EE, Swendsen JD, Aguilar-Gaxiola S, Bijl R, Borges G, Caraveo-Anduaga JJ, Dewit D (1998) Comorbidity of substance use disorders with mood and anxiety disorders: results of the international consortium in psychiatric epidemiology. Addict Behav 23(6):893–907
  20. Newton MA, Noueiry AO, Sarkar D, Ahlquist P (2004) Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics 5(2):155–176
  21. Schork AJ, Thompson WK, Pham P, Torkamani A, Roddey JC, Sullivan PF, Kelsoe JR, O’Donovan MC, Furberg H, Schork NJ (2013) All SNPs are not created equal: genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs. PLoS Genet 9(4):e1003449
  22. Shu H, Nan B, Koeppe R (2015) Multiple testing for neuroimaging via hidden Markov random field. Biometrics 71(3):741–750
  23. Storey JD (2002) A direct approach to false discovery rates. J R Stat Soc Ser B (Methodol) 64(3):479–498
  24. Sun W, Cai TT (2007) Oracle and adaptive compound decision rules for false discovery rate control. J Am Stat Assoc 102(479):901–912
  25. Sun W, Cai TT (2009) Large-scale multiple testing under dependence. J R Stat Soc Ser B (Methodol) 71(2):393–424
  26. Sun W, Reich BJ, Cai TT, Guindani M, Schwartzman A (2015) False discovery control in large-scale spatial multiple testing. J R Stat Soc Ser B (Methodol) 77(1):59–83
  27. Vawter MP, Thatcher L, Usen N, Hyde TM, Kleinman JE, Freed WJ (2002) Reduction of synapsin in the hippocampus of patients with bipolar disorder and schizophrenia. Mol Psychiatry 7(6):571
  28. Wang X, Ye Y, Zhang H (2006) Family-based association tests for ordinal traits adjusting for covariates. Genet Epidemiol 30(8):728–736
  29. Wei Z, Sun W, Wang K, Hakonarson H (2009) Multiple testing in genome-wide association studies via hidden Markov models. Bioinformatics 25(21):2802–2808
  30. Xiao J, Zhu W, Guo J (2013) Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models. BMC Bioinf 14(1):282
  31. Zablocki RW, Schork AJ, Levine RA, Andreassen OA, Dale AM, Thompson WK (2014) Covariate-modulated local false discovery rate for genome-wide association studies. Bioinformatics 30(15):2098–2104
  32. Zablocki RW, Levine RA, Schork AJ, Xu S, Wang Y, Fan CC, Thompson WK (2017) Semiparametric covariate-modulated local false discovery rate for genome-wide association studies. Ann Appl Stat 11(4):2252–2269
  33. Zhang H, Liu CT, Wang X (2010) An association test for multiple traits based on the generalized Kendall’s tau. J Am Stat Assoc 105(490):473–481
  34. Zhu W, Jiang Y, Zhang H (2012) Nonparametric covariate-adjusted association tests based on the generalized Kendall’s tau. J Am Stat Assoc 107(497):1–11

Acknowledgements

The authors are grateful to the editor, the associate editor, and two anonymous reviewers for their constructive comments that helped us improve the article substantially. This work is supported in part by the National Natural Science Foundation of China (no. 11771072 and 11371083); the Science and Technology Development Plan of Jilin Province (no. 20191008004TC). The authors also thank WTCCC for permission to use the GWAS data.

Author information

Corresponding author

Correspondence to Wensheng Zhu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Cui, T., Wang, P. & Zhu, W. Covariate-adjusted multiple testing in genome-wide association studies via factorial hidden Markov models. TEST (2021). https://doi.org/10.1007/s11749-020-00746-8

Keywords

  • Factorial hidden Markov model
  • Covariate adjustment
  • Multiple hypotheses testing
  • False discovery rate
  • GWAS

Mathematics Subject Classification

  • 62M02
  • 62P10
  • 62E20