In this section, we reply to comments raised by Federico Castelletti, Guido Consonni and Luca La Rocca (CCLR) and by Yize Zhao, Zhe Sun and Jian Kang (ZSK) concerning modeling approaches that account for heterogeneous observations.
CCLR raise several important points regarding covariate-dependent graphical models, wherein the covariates provide additional information about the heterogeneity in the samples. As they allude to (in their Section 2), the covariates can enter the models through (i) the mean structure, (ii) the graphical structure directly, or (iii) both. Scenario (iii) is an excellent point: whether covariates can enter the graphical (regression) model through both the mean and the graph (i.e., the precision matrix). This is indeed possible and meaningful in certain contexts. For instance, the graphical regression model reviewed in Section 4.1 can be extended as:
$$\begin{aligned} y_{\ell j} = g_j(\boldsymbol{x}_\ell) + \sum _{k \in pa_\ell (j)} y_{\ell k}\,\theta _{jk}(\boldsymbol{x}_\ell) + \epsilon _{\ell j}, \end{aligned}$$
where $g_j(\cdot)$ captures the direct (mean) effect of the covariates $\boldsymbol{x}_\ell$ on node $j$, $pa_\ell(j)$ denotes the (potentially covariate-dependent) parent set of node $j$, $\theta_{jk}(\cdot)$ are covariate-dependent edge strengths, and $\epsilon_{\ell j}$ is a Gaussian error term.
This extended model can be interpreted as a graphical model on both the Y- and X-variables, where the X-variables are (potential) parents of the Y-variables and their edge strengths do not change across observations, whereas the edge strengths among the Y-variables can vary with the X-variables. The prior formulation presented in Section 4.1 could potentially be extended to obtain a coherent Bayesian specification of this model; however, it requires careful consideration of the trade-off between estimating the mean and estimating the variance (graph) components.
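To make the above concrete, the following is a minimal simulation sketch of the extended model, in which a single scalar covariate enters both the mean (through $g_j$) and the edge strengths (through $\theta_{jk}$). The DAG, the functional forms of $g_j$ and $\theta_{jk}$, and all numerical settings below are illustrative assumptions, not the specification of Section 4.1.

```python
# Minimal sketch: covariate x enters both the mean (g_j) and the edge
# strengths (theta_jk) of a graphical regression over p Y-nodes.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3                       # observations and number of Y-nodes
x = rng.uniform(-1.0, 1.0, size=n)  # scalar covariate x_l (assumed)

# Assumed DAG over Y: node 0 -> node 1 -> node 2 (0-indexed parent sets)
parents = {0: [], 1: [0], 2: [1]}

def g(j, x):
    # assumed covariate effect on the mean of node j
    return 0.5 * (j + 1) * x

def theta(j, k, x):
    # assumed covariate-dependent edge strength for edge k -> j
    return np.tanh(2.0 * x) if (j, k) == (1, 0) else 0.8 * x

Y = np.zeros((n, p))
for j in range(p):                  # generate in topological order of the DAG
    mean = g(j, x)
    for k in parents[j]:
        mean = mean + Y[:, k] * theta(j, k, x)
    Y[:, j] = mean + rng.normal(scale=0.5, size=n)

print(Y[:5].round(2))               # first few simulated rows
```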
CCLR also raise an interesting future avenue of research regarding mixed/bidirected graphical models. Most methods in graphical regression, to the best of our knowledge, (i) are for continuous data and (ii) assume either an undirected (GGM) or a directed (DAG) structure. These methods can be extended to discrete data using latent variable models that accommodate various types of discrete data, e.g., binary, multi-category and count (or mixtures of these); a minimal sketch of this construction is given below. However, this does raise interpretational issues, since the graph is on the latent scale as opposed to the observed data scale (Bhadra et al. 2018). The bidirected graphical regression case is an interesting one, since the conditioning would occur at both the edge and covariate levels. The causal interpretations emerging from these graphs require careful consideration given the interplay between node-level and edge-level dependencies. That said, we agree these are interesting avenues for future research.
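As an illustration of the latent-variable route to discrete data mentioned above, the sketch below generates binary observations by thresholding a latent Gaussian vector whose precision matrix encodes the graph, which also makes explicit the interpretational caveat that the graph lives on the latent rather than the observed scale. The chain graph, edge weights and zero thresholds are illustrative assumptions.

```python
# Minimal sketch: binary data obtained by thresholding a latent Gaussian
# vector Z whose precision matrix Omega encodes the (latent-scale) graph.
import numpy as np

rng = np.random.default_rng(1)
p, n = 4, 500

# Assumed latent-scale graph: a chain 1-2-3-4 in a sparse precision matrix
Omega = np.eye(p)
for i in range(p - 1):
    Omega[i, i + 1] = Omega[i + 1, i] = 0.4
Sigma = np.linalg.inv(Omega)        # latent covariance implied by the graph

Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)  # latent Gaussians
tau = np.zeros(p)                   # node-specific thresholds (assumed zero)
Y = (Z > tau).astype(int)           # observed binary data

# The conditional-independence structure holds on the latent scale (Z), not
# directly on the observed binary Y -- the caveat noted in the text.
print(np.corrcoef(Y, rowvar=False).round(2))
```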
CCLR as well as ZSK raise interesting points regarding mixtures of graphs, i.e., the presence of heterogeneous sub-groups/types in the data. While ZSK focus on the supervised case (where the sub-groups are known), CCLR provide a nice summary of unsupervised learning of the sub-groups along with the graphical models. The former is a special case of the graphical regression models presented in Section 4.1, where the covariate(s) are discrete. This essentially reduces the model to a multiple-graphical model, as nicely summarized by ZSK, albeit with an alternative characterization (see the sketch following this paragraph). The unsupervised case is an interesting one. Mixtures of graphs have been proposed in the literature using infinite mixtures, mostly via Bayesian nonparametric approaches (see CCLR and the references therein), or finite mixtures (e.g., Talluri et al. (2014)). However, we are not aware of any works that coherently allow mixtures to vary across multiple covariates and scales: discrete, binary, or combinations of these. While this would entail substantial modeling and computational challenges, we envision many potential applications in which multiple covariates (e.g., demographic, clinical, molecular) are assayed on the same individuals (e.g., see the case studies in Ha et al. (2018); Bhattacharyya et al. (2020)) and can inform mixtures of graphs, either directed or undirected.
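To illustrate the supervised reduction referred to above, the sketch below uses a known binary group label (a discrete covariate) to index two group-specific precision matrices, so that the graphical regression collapses to a multiple-graphical model. The two chain graphs and the naive group-wise precision estimates are illustrative assumptions; a Bayesian multiple-graphical model would instead place structured priors that borrow strength across groups.

```python
# Minimal sketch: a binary covariate (group label) indexes group-specific
# precision matrices, i.e., a multiple-graphical model.
import numpy as np

rng = np.random.default_rng(2)
p, n_per_group = 3, 400

def chain_precision(strength):
    # assumed chain graph 1-2-3 with the given off-diagonal strength
    Omega = np.eye(p)
    for i in range(p - 1):
        Omega[i, i + 1] = Omega[i + 1, i] = strength
    return Omega

# Groups share the chain structure but differ in edge strengths; they could
# just as well differ in which edges are present.
Omegas = {0: chain_precision(0.45), 1: chain_precision(-0.30)}

data, groups = [], []
for g_label, Om in Omegas.items():
    cov = np.linalg.inv(Om)
    data.append(rng.multivariate_normal(np.zeros(p), cov, size=n_per_group))
    groups.append(np.full(n_per_group, g_label))
Y, groups = np.vstack(data), np.concatenate(groups)

# Naive group-wise precision estimates (no sparsity or shrinkage), shown only
# to make the "one graph per level of the discrete covariate" point concrete.
for g_label in (0, 1):
    S = np.cov(Y[groups == g_label], rowvar=False)
    print(f"group {g_label} estimated precision:\n", np.linalg.inv(S).round(2))
```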