Mechanism-based risk assessment is urged to advance and fully permeate into current safety assessment practices, possibly at early phases of drug safety testing. Toxicogenomics is a promising source of mechanisms-revealing data, but interpretative analysis tools specific for the testing systems (e.g. hepatocytes) are lacking. In this study, we present the TXG-MAPr webtool (available at https://txg-mapr.eu/WGCNA_PHH/TGGATEs_PHH/), an R-Shiny-based implementation of weighted gene co-expression network analysis (WGCNA) obtained from the Primary Human Hepatocytes (PHH) TG-GATEs dataset. The 398 gene co-expression networks (modules) were annotated with functional information (pathway enrichment, transcription factor) to reveal their mechanistic interpretation. Several well-known stress response pathways were captured in the modules, were perturbed by specific stressors and showed preservation in rat systems (rat primary hepatocytes and rat in vivo liver), with the exception of DNA damage and oxidative stress responses. A subset of 87 well-annotated and preserved modules was used to evaluate mechanisms of toxicity of endoplasmic reticulum (ER) stress and oxidative stress inducers, including cyclosporine A, tunicamycin and acetaminophen. In addition, module responses can be calculated from external datasets obtained with different hepatocyte cells and platforms, including targeted RNA-seq data, therefore, imputing biological responses from a limited gene set. As another application, donors’ sensitivity towards tunicamycin was investigated with the TXG-MAPr, identifying higher basal level of intrinsic immune response in donors with pre-existing liver pathology. In conclusion, we demonstrated that gene co-expression analysis coupled to an interactive visualization environment, the TXG-MAPr, is a promising approach to achieve mechanistic relevant, cross-species and cross-platform evaluation of toxicogenomic data.
Mechanism-based risk assessment is the gold standard for safety assessment of drugs and chemicals, but is resource- and time-intensive (Lanzoni et al. 2019). Establishing mechanisms generally requires extensive experimentation, leveraging literature data collected over years, and is often based on preclinical (animal) studies, which may or may not translate to human (Bailey and Balls 2019; Clark and Steger-Hartmann 2018; WHO-IPSC 2018). Approaches that rapidly reveal mechanistic detail of preserved responses between animal and human would advance next generation risk assessment and impact both predicting and monitoring toxicity (Rivetti et al. 2020). For example, early in development, drugs that cause preclinical toxicity are often discarded based on preclinical studies without knowing if a toxicity mechanism will translate to human, thus valuable compounds may be discarded. In addition, screening for safer compounds is hampered if testing assays do not interrogate compounds against a mechanism-related event and results are not interpreted in the context of organ-specific toxicity.
Liver is a primary target organ for drugs and chemical toxicants due to its role in metabolism and disposition (Zhang and Venkat 2020). Drug-induced liver injury (DILI) is a major cause of clinical liver failure (Björnsson 2019; Reuben et al. 2016), drug attrition and black box warnings (Onakpoya et al. 2016; Solotke et al. 2018; Watkins 2011). DILI manifests as a variety of clinical pathologies and may depend on genetic and environmental factors making its prediction a challenge for the current preclinical testing paradigm (Koido et al. 2020). Likewise, hepatotoxicity and hepatocarcinogenesis are major concerns for environmental exposures (Colombo et al. 2019; Yorita Christensen et al. 2013). Both non-genotoxic and genotoxic carcinogens often produce hepatoxicity prior to the emergence of liver tumors in longer term preclinical studies (Karin and Dhar 2016). In both situations, hepatotoxicity can be regarded as a multistep, multicellular disease process, where an initial molecular stress is followed by a series of cellular key events that couple the initial stress to an apical endpoint observable as a pathology, e.g. hepatotoxicity or liver tumors. However, in the absence of a mechanism that links cellular (stress) events to a pathology, risk assessment is typically based on the apical endpoint which can take months or years to develop.
Toxicogenomics, the study of transcriptome responses in toxicology, is a promising tool for a comprehensive analysis of toxicity to enable mechanism-based risk assessment (Liu et al. 2019). Current toxicogenomic approaches mainly rely on the analysis of Differentially Expressed Genes (DEGs) or enrichment analysis using annotated pathway analysis tools (Barel and Herwig 2018). Such an approach depends on ontologies with a high degree of redundancy and capture current knowledge of toxicology and general biology. These approaches are useful but biased toward genes that are well annotated and are not designed to interpret mechanisms of toxicity applied to specific testing systems (e.g. hepatocytes). For these and other reasons, and despite several decades of research, toxicogenomics has not produced the anticipated transformation of current safety assessment standard practice (Vahle et al. 2018).
It is known that groups of genes expressed downstream of a (stress-responsive) transcription factor will show co-expression (Yin et al. 2021). For this reason, gene co-expression analysis has been applied to toxicogenomic datasets for rat liver (Podtelezhnikov et al. 2020; Sutherland et al. 2018), but not for human hepatocytes up to now. These co-expressed gene sets mediate the cellular response to stress, providing mechanistic information on the cellular processes and key events involved in adaptation and progression. Herein, we used weighted gene co-expression network analysis (WGCNA) to identify sets of co-expressed genes (termed ‘modules’) using the large TG-GATEs toxicogenomic dataset for primary human hepatocytes (PHH) as a surrogate to address hepatotoxicity and gene expression in the human context (Igarashi et al. 2015; Zhang and Horvath 2005). Gene modules were deployed in an R-Shiny analysis framework accessible via an interactive website, the PHH TXG-MAPr (https://txg-mapr.eu/WGCNA_PHH/TGGATEs_PHH/) to facilitate rapid visualization and mechanistic interpretation of transcriptomic data. Gene modules were annotated with external annotation resources (e.g. pathway enrichment and transcription factor modulation), which are useful in identifying known stress response pathways and modules populated by novel genes responsive to specific stressors. Using endoplasmic reticulum (ER) stress as a case study, we show that canonical ER stress response genes are captured in gene co-expression modules, including novel guilt-by-association genes. We identified ER stress as an early event in cyclosporine A (CSA)-induced toxicity and showed that ER stress modules respond similarly between CSA and the prototypical ER stressor, tunicamycin. ER stress and other known stress response pathways, which were captured in the PHH modules, are conserved across species and test systems. Stress response modules correlate in clusters with defined biological functions triggered by compound exposure. We showed that datasets of different sources can be analyzed with the PHH TXG-MAPr tool for mechanistic interpretation. Finally, we leveraged our PHH WGCNA modules to evaluate donor-to-donor variability upon tunicamycin exposure and identified biological sources of variations. Our results show that co-expression analysis of toxicogenomic data using the PHH TXG-MAPr framework can support human-relevant next generation mechanism-based risk assessment.
Materials and methods
Gene expression data processing
TG-GATEs and GEO microarray data
Microarray data from primary human hepatocytes (PHH) in the TG-GATEs repository were downloaded (https://dbarchive.biosciencedbc.jp/en/open-tggates/download.html) and jointly normalized using the Robust Multi-array Average (RMA) method within the affy R package version 1.52.0 (Gautier et al. 2004). To map probe sets to Entrez IDs, the BrainArray chip description file (CDF) version 20 was used (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/genomic_curated_CDF.asp, HGU133Plus2 array version). Under this annotation, every gene is defined by a single probe set. This resulted in 19,363 unique probe sets, each mapped to a single gene. The TG-GATES (TG) repository contains 941 PHH treatments, where each treatment is defined as a combination of compound, time and concentration. Samples treated with vehicles are available for each individual time point, thus resulting in batch effect removal (Grinberg et al. 2014). For each compound/timepoint combination, the limma R package version 3.30.13 (Ritchie et al. 2015) was used to calculate log2 fold-change values, which was performed by building a linear model fit including concentration levels in the design as covariate and computing the log-odds of differential expression by empirical Bayes moderation. Significance of log2FC was determined by using Benjamini–Hochberg multiple testing correction. TG-GATEs rat primary hepatocytes (RPH) and rat in vivo liver datasets have been analyzed following the same steps using suitable BrainArray CDFs (version 19, Rat2302 array version). Additional human hepatocytes datasets for uploading into the TXG-MAPr tool were first downloaded from Gene Expression Omnibus (GEO: https://www.ncbi.nlm.nih.gov/geo/): GSE83958; GSE45635; GSE53216; GSE74000; GSE13430; GSE104601. The first four datasets were processed similarly to the TG-GATEs PHH dataset, because the samples were analyzed with the same microarray platform (Affymetrix HGU-133 Plus 2). The GSE13430 and GSE104601 datasets were generated from the Agilent Human 1A Microarray (V2) G4110B and SurePrint G3 Human GE v2 8 × 60 K microarray platforms, respectively. These datasets were analyzed with GEO2R using the provided NCBI gene annotation. In the case of duplicated measurements (probes) for a single gene, the most significant probe (adjusted p value) was used for uploading the data.
TempO-Seq S1500 + set PHH data
The inter-individual variability in chemical-induced ER stress signaling and module activation was evaluated by using available TempO-Seq data of PHHs derived from 50 individuals which were exposed to a wide concentration range of tunicamycin (Niemeijer et al. 2021; Mav et al. 2020). In short, plateable cryopreserved PHHs derived from 50 individuals (KaLy-Cell, Plobsheim, France) were plated in 96 wells BioCoat Collagen I Cellware plates (Corning, Wiesbaden, Germany) at a density of 70,000 viable cells per well. Cells were allowed to attach for 24 h and treated with tunicamycin (Sigma) at a concentration range of 0.0001–10 µM for 8 or 24 h. After exposure, cells were lysed with 1 × TempO-Seq lysis buffer (BioSpyder) and stored at − 80 °C. Lysates were analyzed at BioSpyder (Carlsbad, CA, USA) using the TempO-Seq technology in combination with the S1500 + gene set (Mav et al. 2018). Data analysis is reported in Mav et al. 2020, here briefly summarized: experiments having library size of raw counts lower than 100,000 counts were filtered out. Raw counts were normalized with DESeq2 (R package version 1.28.1) normalization (Love et al. 2014), log2 transformed and analyzed with BMD express version 2.3 (Phillips et al. 2019).
Weighted gene co-expression analysis
To identify co-expressed genes from the PHH data, we used the WGCNA R package version 1.51 (Langfelder and Horvath 2008) and applied it to a matrix consisting of 941 rows (PHH experiments) and 17,500 columns (log2 fold change values for genes). Genes were included into the analyses when they showed FDR (BH) < 0.001 in at least one experiment. We created unsigned gene modules (i.e. grouping together co-induced and co-repressed genes), and selected the optimal soft power threshold maximizing both the scale-free network topology using standard power-law plotting tool in WGCNA (Langfelder and Horvath 2008) and the exclusion of genes having low gene intensity in the DMSO controls (determined via t-tests). We selected 5 as the optimal soft-power parameter. By further refining modules built using WGCNA function to merge similar modules (those having correlation of their eigengene values ≥ 0.8), we obtained 398 modules containing 10,275 genes. As described in (Sutherland et al. 2016), for each experiment we calculated the eigengene score (EGs, or module score) which summarizes log2 fold change of their constituent genes. Briefly, this protocol consisted of performing PCA on the gene matrix of each module, normalizing the log2FC across the entire dataset using Z-score conversion, the 1st principal component corresponds to the EGs. To simplify comparison between modules, the raw module score was normalized to unit variance (fraction between each module score and its standard deviation across the entire dataset) facilitating comparison across modules and across treatment. Therefore, the modules score indicates the level of activation or repression induced by a given treatment, when considering such changes in the context of the large collection of drug perturbations. Represented as Z-score, an eigengene score greater than + 2.0 or smaller than − 2.0 can be considered as a large (and relevant) perturbation in the context of the 941 experiments Although the EGs represent the overall activation or repression of the module gene members, each gene is ordered based on intra-modular connectivity with other genes. For every gene, we calculated the correlation between the log2 fold change versus the eigengene score of the parent module across the 941 experiments (termed ‘corEG’). The gene with the highest correlation between log2FC and module EGs (so called ‘hub gene’) is the most representative of the entire module matrix and show stronger connection to the other module genes. Both positive and negative correlation with the EGs, corEG, were allowed, thus genes that are inversely correlated can be members of the same module. A negative corEG indicates an inverse relationship between the gene log2FCs and its module EGs. The matrix containing EGs across all treatments was organized into a folded hierarchical tree (dendrogram), based on Ward’s hierarchical clustering of pair-wise Pearson correlations for each module across all treatment conditions. Module locations on dendrogram branches were identified with a hierarchical anti-clockwise nomenclature system (Supp. Fig. S1a). Some module branches were artificially elongated to separate module clusters and improve visualization (Supp. Fig. S1b). Compound correlation was calculated with Pearson and Spearman correlation using the module EGs across the entire set of treatment conditions (resulting in 941 instances); similarly, module correlation was calculated across all modules (resulting in 398 instances). Preservation between the modules structure obtained with the PHH TG-GATEs data set and the RPH TG-GATEs and rat in vivo liver data sets has been performed as follows: (1) rat gene IDs have been converted to human gene IDs with the Rat Genome Database, which provided one-to-one ortholog matches (https://download.rgd.mcw.edu/data_release/RAT/ORTHOLOGS_RAT.txt, Smith et al. 2020), (2) preservation statistics have been calculated with the WGCNA R package and thresholds for interpretation were adopted by relevant literature (Langfelder et al. 2011). Briefly, a module showing Z summary >= 2 is considered moderately preserved, >= 10 highly preserved. A lower median rank indicate higher preservation.
Pathway mapping and enrichment analyses
Over Representation Analysis (ORA) was performed via Consensus Pathway DB (cpdb version 34), including the following databases: BioCarta, EHMN, HumanCyc, INOH, KEGG, NetPath, Reactome, Signalink, SMPDB, Wikipathways, UniProt, InterPro (Kamburov et al. 2013). GO enrichment was obtained with the R package topGO version 2.26.0, algorithm = "classic", statistic = "fisher" (Alexa and Rahnenführer 2007). For both resources, we included in the tool enriched terms satisfying hypergeometric test p value < 0.01. However, for module interpretation (Table 1 and Table S8) only the top 10 associated terms per modules were included, that corresponded to terms having always adjusted p value (FDR) for CPDB lower than 0.05. String DB (Szklarczyk et al. 2015) was used to obtain protein–protein interaction networks of specific modules. Edge weights are proportional to the combined score of the nodes that the edges connect (Szklarczyk et al. 2015). Graphical rendering was obtained with Cytoscape (Shannon et al. 2003) using the StringApp (http://apps.cytoscape.org/apps/stringapp) bridge component.
Transcription factor (TF) enrichment and TF activity scoring
A hypergeometric test was performed on gene members in each WGCNA module to identify its regulatory TFs using the function phyper within the stat package in R (version 3.5.1). The gene set of TFs and their regulated genes (regulons) are derived from DoRothEA (version 2, http://www.github.com/saezlab/dorothea/releases, Garcia-Alonso et al. 2019) with two sets of confidence levels: the “high confidence” level comprises categories A, B and C while the “high coverage” level comprises categories A, B, C and D. The enriched TFs with p value less than 0.01 were included in the study. In parallel, TFs’ activities were estimated as normalized enrichment scores using the function viper from the viper package (version 1.16.0, Alvarez et al. 2016) with two confidence sets of TF-regulon from DoRothEA as described. All parameter settings were assigned as in the original DoRothEA study (Garcia-Alonso et al. 2019).
Web application TXG-MAPr
The user interface application of the TXG-MAPr tool has been implemented using the R-shiny package version 1.6.0 (Chang et al. 2021). The graphical part of the application has been implemented through the functionalities of the ggplot2, ape, igraph, hclust and pheatmap R-packages (Csardi and Nepusz 2006; Kolde 2019; Paradis and Schliep 2019; Wickham 2016). The credential system to log into the PHH TXG-MAPr tool is established using the shinyauthr package (https://github.com/PaulC91/shinyauthr). The PHH TXG-MAPr tool is available at https://txg-mapr.eu/WGCNA_PHH/TGGATEs_PHH/.
Uploading new data files into TXG-MAPr
In order to calculate new EGs for each module from external data, first a modified Z-score is calculated by dividing the gene log2FC by the standard deviation of the gene log2FC across all TG-GATEs conditions, and is further weighted by the correlation eigengene score (corEG) (Eq. 1). New EGs for each module are calculated by summing the Z-scored gene log2FC values of the module genes (n = number of genes in a module), normalized by the standard deviation of the raw module score in the TG-GATEs dataset (Eq. 2). When data of certain genes in a module is not available in the uploaded dataset, then the Z-score for that gene is assumed to be zero, which may create an underestimation of the final module EGs:
The new module EGs will be overlaid onto the PHH TXG-MAPr dendrogram and will be fully integrated into the web application for that particular session. Data will be removed when the session is closed.
Cluster correlation of the expanded seed module set
Figures were made using R and the packages ggplot2, pheatmap, igraph. Heatmap in Fig. 3 is obtained with the following steps: (1) selection of ‘seed modules’ i.e., modules having clear interpretation based on term enrichments and TFs associations (see Table 1). Because TG-GATEs data is a sparse matrix, i.e. not all the compounds have been tested at all concentration levels and time points, we removed the 2-h experiments (missing for roughly 50% of the compounds) and applied imputation of the remaining missing value with Singular Value decomposition via the R package pcaMethods using the svdImpute function (Stacklies et al. 2007; Troyanskaya et al. 2001). (2) Pearson correlations between seed modules and all other preserved modules were calculated considering all remaining EGs from TG-GATEs (119 modules × 779 experiments). Modules were added in (termed ‘add-in modules’) if the (seed x target) correlation values were ≥ 0.7 to yield the ‘expanded set’ of 87 modules. Hierarchical clustering was obtained by applying the Ward D2 method, using 1-cor as distance. In particular, we applied the WardD2 method, including the full implementation of the Ward clustering criterion. Note that the results of Ward’s agglomerative clustering are likely to delineate clusters that visually correspond to regions of high densities of points in PCA ordination, e.g. produce ‘round’ clusters (Murtagh and Legendre 2014).
Targeted genes set analysis (S1500 +)
To evaluate the quality of the genes belonging to the BioSpyder TempO-Seq S1500 + set, we tested for difference of mean absolute corEG with a random draw of genes from the complete pool of genes available in the TG-GATEs dataset. The randomly selected gene set have the same size as the overlap of the S1500 + set with the TG-GATEs gene set (n = 1830) and this process was repeated 10,000 times (Wilcoxon test), adjusting the p value with Bonferroni method. The heatmap in Fig. 5 was obtained with the “complete” clustering algorithm applied to the Euclidean distance between samples. Donors module clusters were obtained with the same method as for TG-based module clusters (see paragraph above). The overlap between TG module clusters and donor module clusters was calculating using the WGCNA R package (Langfelder and Horvath 2008) using the overlap function, which calculates the numerical overlaps between groups and quantify the significance by applying Fisher exact test. Donor traits (phenotypes) associations with modules responses after treatment of donor hepatocyte cultures were determined as previously described in (Sutherland et al. 2018). In brief: (1) different concentrations were analyzed separately, and only after 24 h of exposure (generally higher transcriptomic changes). (2) For cluster to traits associations, cluster behaviors were calculated as mean of the EGs scores of the modules’ members of each cluster. Since only modules with high correlation populate a cluster, we expect that averaging their EGs would lead to a concordant score. To understand driver-modules for each association, we calculated module-to-trait relationships. (3) The association between cluster (or module) and the occurrence of a donor trait was quantified using Cohen’s d, a measure of effect size. Since the average absolute module eigengene (avgAbsEG), a measure of overall transcriptional activity, can have an effect size for traits associations, logistic regression was performed to determine the contribution of a given module in explaining the residual odds of toxicity after accounting for avgAbsEG, and the significance of the module represented as adjusted p values (padj).
The PHH TXG-MAPr: a tool to visualize and explore biological interpretation in primary human hepatocyte transcriptomic data
We applied WGCNA to the TG-GATEs primary human hepatocytes (PHH) dataset (Igarashi et al. 2015), which includes exposure data for 158 compounds at various time points and concentration levels to identify the predominant networks of co-expressed genes (Fig. 1a exemplifies the process for a subset of co-expressed genes). We obtained 398 networks (modules) of highly correlated genes. Modules are labeled with a number inversely proportional to the number of member genes; larger modules have smaller number labels. An eigengene score (EGs) was computed for each module as the first principal component of each module data matrix and further scaled (see Material and Methods). Thus, the EGs represents the trend (induction or repression) of the entire module based on the included genes. By performing WGCNA, the final data matrix was reduced to 941 columns (treatments) and 398 rows (module EGs) which corresponds to a 97.7% reduction in dimensionality of gene expression.
To facilitate the analysis of transcriptomic information from module responses, we developed a module-based R-Shiny visualization and analysis framework, the PHH toxicogenomic (TXG) MAPr (PHH TXG-MAPr). The EGs-treatment data matrix was organized into a folded hierarchical tree, or dendrogram, based on Ward’s hierarchical clustering of pair-wise Pearson correlations for each module across all treatment conditions (Fig. 1b, Supplemental Fig. S1a, b). The dendrogram allows one to visually appreciate the induction or repression of modules EGs (red to blue color scale, respectively) and the proximity to modules with similar behavior (Pearson R; see Supp. Fig. S2 for dose– and time–response dendrograms for example compounds). We also incorporated module enrichment measures for biological pathways and ontologies, transcription factor-target pairs, compound similarity analysis and a variety of other useful functions for mechanistic analysis and data mining.
To illustrate the utility of the PHH TXG-MAPr analysis framework, we investigated the response of cyclosporine A (CSA), which was transcriptionally active in PHH and has both mild hepatotoxic and more severe renal toxicity potential (Rezzani 2004). Strong induction of modules PHH:13 and PHH:62 can be seen 24 h after treatment with 6 µM CSA (Fig. 1b). Using the compound correlation functionality, we sampled all pair-wise Pearson correlations for all 398 module EGs across all treatment conditions; CSA shows highest similarity to tunicamycin (Pearson R = 0.84), suggesting a common mode of action (Fig. 1c, center). Modules at the extremities of the correlation plot (PHH:13 and PHH:62 in the example) can be highlighted and tabulated to facilitate further analysis. Correlation between gene log2FC from the same treatment conditions had lower similarity (Pearson R = 0.74; Fig. 1c, top). On average, across all TG-GATEs compounds the EGs-based max Pearson correlations were higher than max correlation based on gene log2FC (Supp. Fig. S1c), suggesting that analyzing gene networks (modules) improves the robustness of compound comparison of transcriptomic data by averaging out the variations of individual gene perturbations. Although PHH:13 and PHH:62 are in different branches they did show a Pearson correlation of 0.62 for all TG-GATEs treatment conditions (Fig. 1c, bottom).
Tunicamycin (TUN) is a prototypical inducer of ER stress while there is literature reporting that CSA also perturbs ER functions (Foufelle and Fromenty 2016; Van Summeren et al. 2013; Vickers et al. 2017). Therefore, we investigated modules PHH:13 and PHH:62 for genes associated with ER, ER stress and unfolded protein response (UPR). Module PHH:13 contains 121 genes, including well-known ER stress genes, like HSP90B1 (GRP94), SEL1L, PDIA6, HSPA5 (BiP/GRP78) and its co-chaperone DNAJB9 (Suppl. Table S2). Since PHH:13 was too large to visualize in a gene interaction plot, Fig. 1d (top) shows the module plot for the smaller module PHH:62 (19 genes). This module contains genes involved in endoplasmic reticulum-associated degradation (ERAD), such as SELENOS and SELENOK, but also other ER resident genes with less clear connection to ER stress yet, such as FKBP2, SEC11C and SSR4 (Fig. 1d, top). There is a clear time- and dose-dependent induction of module PHH:62 by CSA, which follows a similar trend for the log2FC induction of the hub-like (highest corEG) module genes (Fig. 1d, center and bottom).
Using the PHH TXG-MAPr module enrichment functionalities, we investigated the enrichment of biological processes and pathways (see Methods) as well as transcription factors (TFs) enrichment using the DoRothEA gene set resource (Garcia-Alonso et al. 2018). Results were overlaid on the module dendrogram (Fig. 1e). Not surprisingly, modules PHH:13 and 62 showed the lowest p-values for GO-CC term “endoplasmic reticulum” as well as terms related to endoplasmic reticulum (ER) stress amongst others (Fig. 1e, top). Module PHH:13 was also enriched for transcription factor ATF6 target genes (Fig. 1e, center) consistent with the presence of more highly annotated ER stress genes. To determine if PHH:13, the primary module enriched for ATF6 gene targets, reflected general ATF6 activation, we calculated the ATF6 activation scores for all ATF6 target genes in the 10275 TXG-MAPr genes (see more details on DoRothEA in the “Materials and Methods” section) and we observed that ATF6 scores also showed time- and dose-dependent activation for both CSA and TUN (Fig. 1e, bottom).
Using the PHH TXG-MAPr analysis framework, we were able to rapidly identify the activation of an ATF6 regulated ER stress response as an early event following cyclosporine A exposure. Data underpinning these functionalities, and others not noted here, are accessible in a tabular format in the supplementary materials (Supp. Tables S1–S7). The dedicated PHH TXG-MAPr application is available at https://txg-mapr.eu/WGCNA_PHH/TGGATEs_PHH/.
Module preservation in the PHH-TXG-MAPr
Preservation statistics can be used to determine if networks’ node-edge relationships defined in one biological system are preserved in another (Langfelder et al. 2011). We evaluated network preservation of PHH modules versus rat primary hepatocyte (RPH) and in vivo rat liver TG-GATEs datasets using two different preservation statistics: Z-summary and Median Rank (see Methods, Fig. 2a, b, Supp. Table S8, Supp. Fig. S3). Z-summary preservation statistic shows higher dependency on modules size, as noted in literature; large modules such as PHH:13 are highly preserved (Fig. 2a, color scale) (Langfelder et al. 2011). The complementary Median Rank statistic (Fig. 2b) is less dependent on module size, and can assign high preservation ranking (quantified with low numerical scores) to small modules. Therefore, we considered preserved modules as those with Z-summary preservation score for both systems (RPH and rat liver) higher than 2 or ranked in the top 100 modules for the Median Rank statistics for both systems (for the latter, arbitrary threshold corresponding to 25% of all the PHH modules). This selection led to a set of 102 (26%) preserved modules from the PHH TXG-MAPr (Supp. Table S8), which include the UPR modules PHH:13 and PHH:62.
Stress response pathways represented by PHH modules
Pathway information facilitates mechanism-based risk assessment (Krewski et al. 2020). Therefore, we used the TXG-MAPr enrichment functionalities to identify modules representing cellular stress response pathways typically involved in toxicity and activated by stress-responsive TFs. These include the ER stress response and the larger integrated stress response (ISR), as well as inflammatory pathway, mitochondrial response, oxidative stress and the DNA damage responses, all of which were previously shown to be relevant for identifying the modes of action of compound-induced toxicity (Wink et al. 2014). We focused on modules which (1) were identified by consistent enriched terms and TF associations with p values < 0.01 and FDR < 0.05, and (2) show response to prototypical compounds (Fig. 2c, Table 1). Below are the selected stress responses pathways represented by PHH modules which we aim to highlight in this study:
ER stress and ISR
As noted above, modules PHH:13 and PHH:62 are enriched for terms associated to endoplasmic reticulum localization, UPR and ERAD as well as transcriptional regulation by ATF6, all of which are components of the larger ISR (Hetz et al. 2020; Ron and Walter 2007). We also looked for modules regulated by ATF4, a transcription factor downstream of the ISR hub, and IF2α, a regulator of translation after stresses including ER stress, starvation, and viral infection (Pakos‐Zebrucka et al. 2016). Modules PHH:15 and PHH:295 were enriched for ATF4 gene targets, and terms associated with amino acid metabolism and transport, respectively (Table 1), which are processes that are regulated by ATF4 (Hetz et al. 2020). All four modules respond to cyclosporine A and tunicamycin (Fig. 2c, panel i).
Heat-shock, proteasome and lysosome
In response to proteotoxic stress, damaged proteins are bound to members of heat shock-inducible chaperone system (heat shock response), which facilitate removal of damaged proteins by lysosomes and/or proteasome degradation. Modules PHH:177, PHH:76 and PHH:82 (proteasome) and PHH:131, PHH:95 (heat shock) are preserved and respond to treatment with allyl alcohol (Mandrekar et al. 2008) (Fig. 2c, panel ii, Table 1). Module PHH:17 (annotated for GO:CC lysosome and includes SQSTM1) is also found to be preserved and is activated by fluoxetine, a known phospholipidosis inducer (Breiden and Sandhoff 2019) (Fig. 2c, panel ii, Table 1).
Immune response pathways including inflammatory mediators are activated in hepatocytes during liver injury and in disease states (Campos et al. 2020; Woolbright and Jaeschke 2017). Immune response and inflammation terms are enriched in several preserved modules (PHH:12, PHH:247, PHH26, PHH:22 and PHH:136) that respond to inflammatory agents such as interferon-α and TNF-α (Fig. 2c, panel iii, Table 1). Modules PHH:242, PHH:44 and PHH:70 are also annotated for immune response but not preserved (Table 1). Compound regulation and module enrichment terms identify subgroups of immune response modules. STAT signaling modules (PHH:12, PHH:247) are strongly induced by interferon, while NF-kB signaling modules (PHH:22, PHH:136) are mainly induced by TNFα. Module PHH:26 is induced by both stimuli, but also has a mixed NF-kB and STAT annotation.
Mitochondria are an important target for hepatotoxic chemicals and mitochondrial injury is commonly used to screen for hepatotoxic potential (Rana et al. 2019; Weaver et al. 2020; Yang et al. 2015). Mitochondrial damage is commonly assessed by changes in respiration and the mitochondrial membrane potential. Less is known about the regulation of genes coding for mitochondrial proteins in response to toxic stress. Several preserved PHH modules are found to be annotated with mitochondria-related and mitochondria-component specific terms (PHH:113, PHH:138, PHH:2, PHH:256, PHH:33, PHH:97). These modules respond positively to doxorubicin (Osataphan et al. 2020) and are found to be repressed by ethionine (Zhang et al. 2020) (Fig. 2c, panel iv, Table 1). Notably, PHH:33 and PHH:97 were annotated by terms relating to regulation of mitochondrial ribosome translational control and mitochondrial organization. Module PHH:173 shows enrichment for mitochondria genes, but is not preserved (Table 1).
DNA-damage and oxidative stress
Modules PHH:59 and PHH:83 are enriched for TP53-regulated genes and annotated for terms consistent with the DNA damage response (DDR). The NFE2L2 (NRF2) associated modules PHH:144 and PHH:325 include key genes such as SRXN1 and NQO1 which are direct targets of NRF2, and enrich for terms such as oxidoreductase and glutathione metabolism. In contrast to the other preserved modules from Table 1, modules associated with NRF2 and TP53 activation are not well-preserved, but these modules do respond as expected to specific inducers consistent with interpretation as oxidative stress and DNA damage modules, respectively (Fig. 2c, panels v and vi and Table 1). Only modules PHH:243 (includes GADD45A/B) and PHH:337 (associated with NFE2L2) are found to be preserved (Table 1).
The examples described above exemplify the utility of PHH TXG-MAPr for identifying modules relevant to known stress pathway useful for mechanistic interpretation and benchmarking preservation of co-expression across species and liver in vivo. Other cellular processes and pathways not discussed can be identified using the main enrichment terms and TFs associations of the whole preserved module set (Table S8).
PHH stress response module seeds cluster in an interaction map to describe mechanisms of toxicity
In toxicity pathways, early events are coupled to a cascade of perturbations that lead to an adverse outcome, e.g. cell death. Using the stress pathway modules as seeds, we created an interaction map (cluster correlation) between stress pathways and other highly preserved modules that cluster important cellular processes as well as preserved modules without clear annotation and identify off-diagonal interactions between important biological themes.
We used the well-annotated modules identified above (Table 1 and Table S8) as seeds, and identified preserved ‘add-in’ modules (with or without clear enrichment) if they had a Pearson correlation ≥ 0.7 with a seed module, calculated across all TG-GATEs PHH compound treatments. The final expanded set of 87 seed and add-in modules is in Table S8, identified by the column “Cluster”. Figure 3 shows a cluster correlation analysis of the expanded seed set module EGs based on all TG-GATEs compounds, together with representative compound responses. Eight distinct clusters emerged with modules annotated for similar processes forming macro groups; examples are described below. Note that within each group, modules with limited to no annotation have been included, suggesting they belong to the same response group.
Cluster 1: immune response
Cluster 1 includes all the seed modules annotated for immune response pathway terms (Table 1), except PHH:247 (cluster 8). Add-in module PHH:317, which showed no annotation terms, contains an interleukin receptor (IL18R1), but also some genes induced by inflammatory molecules, for which little is known regarding involvement with immune response (GSAP, CSTO). Similar to the observations obtained with TNF-α and interferon-α, inflammatory modules clusters into two subgroups, one enriched with NF-κB modules, the other in STAT regulated modules.
Cluster 5: mitochondria
Cluster 5 contained seed modules annotated with mitochondria related terms, but also add-in modules with less clear annotation. Nevertheless, when looking for protein–protein interactions among genes belonging to this group of modules, high degrees of connections are found for a large subgroup of genes, connecting within modules and showing high corEG (Wilcoxon rank sum test p = 7.72E-07, Fig. S4).
Cluster 3, 4 and 8: cellular stress responses
The above characterized stress modules, like oxidative stress, ER stress response, heat shock and DDR, clustered in isolated sub-groups within clusters 3, 4 and 8. However, these clusters also included other processes less commonly associated with prototypical stress responses. For example, Cluster 3 contained the seed proteasome module (PHH:76), a module containing two growth arrest and DNA-damage inducible (GADD) genes, GADD45A and GADD45B (PHH:243), the ATF4-regulated modules (PHH:15 and PHH:295), a module containing several inducible heat shock protein transcripts (PHH:131) and a module containing two NRF2 target genes, MAFG and MXD1 (PHH:337). Cluster 3 also contained two RNA processing modules (PHH:42, PHH:19) associated with the nucleolus/ribosome biogenesis and mRNA processing, respectively. Cluster 8 contains the seed DDR modules (PHH:59 and PHH:83), both of which contain multiple TP53 regulated genes, and a seed mitochondrion module (PHH:2) along with modules enriched for fatty acid/cholesterol metabolism modules (PHH:40, PHH:240) but also cellular homeostasis modules, like PHH:1 (cell cycle), and PHH:11 (metabolism).
The expanded interaction map describes the interplay between cellular processes in response to chemical treatment
The interplay across module clusters appeared to capture elements of the dynamic interactions among biological response networks (off-diagonal correlation in Fig. 3). Clusters can interact positively, e.g. the mitochondria cluster 5 and the RNA processing cluster 6 (which contains both ribosome RNA (rRNA) and mRNA processing modules), or negatively, e.g. activation of stress cluster 3 is commonly associated with down-regulation of modules in cluster 8 (cell cycle and mitochondrion). Interestingly, half the modules, largely those annotating for NF-κB in immune response cluster 1, showed positive correlation (induced) with activation of processes in clusters 2 and 3, while the other half, including the STAT modules, were inversely correlated and down-regulated. Cluster 1 showed a similar bifurcation with cluster 8 but in the opposing direction, i.e. the NF-κB modules positively correlated with clusters 2 and 3 were negatively correlated with cluster 8 while the STAT modules showed a positive correlation.
An integrated understanding of interaction among complex cellular processes is critical for a mechanism-based risk assessment and requires identifying key event associations with an apical endpoint. Compound with higher versus lower risk may perturb more cellular processes in progressing to cytotoxicity while others may have a more adaptive phenotype. Evidence for progressive perturbation (or lack thereof) can be derived within the module cluster correlation landscape by comparing the module activation by exemplar compounds (Fig. 3, right). For example, tunicamycin is a prototypical ER stress inducer and has a low cytotoxic concentration in human hepatocyte cells and a low LD50 of 2 mg/kg in mice (Morin and Bernacki 1983; Zhang et al. 2014). Tunicamycin clearly activates the ER stress response in PHH both via the ATF6 arm in cluster 4a (PHH:13 and 62), the ATF4 arm (cluster 3) and fatty acid metabolism (cluster 8 and cluster 4), concurrently with strong repression of rRNA and mRNA processing (cluster 6) and immune response activation (cluster 1). In contrast, cyclosporine A, which is less toxic with an LD50 in mice ranging from 96 to 2803 mg/kg (depending on the route of administration (Sax 1975)), primarily triggered the ATF6-ER response (cluster 4a). Taken together, these observations suggest that tunicamycin is inducing a more severe response impacting diverse cellular biological processes, whereas cells challenged with the used concentrations of cyclosporine A showed limited module perturbations.
A different picture emerged when we considered two prototypical drugs that can induce oxidative stress: acetaminophen and omeprazole. For both, immune responses were primarily repressed (cluster 1), while oxidative stress modules (cluster 4) were activated. In addition, omeprazole activates modules in cluster 3, concurrent with a deactivation of modules in cluster 8. Specific transcription regulation modules are activated, including PHH:4 which contains several stress-induced TFs (ATF4, ATF3, NFE2L2, TP53) and the xenobiotic stress module, including CYP enzymes (cluster 4). Interestingly, PHH:358 (cluster 4) contains CYP1A1 and CYP1A2, canonical AHR targets, and omeprazole is a known AHR ligand (Safe et al. 2020). Taken together, these observations suggest that omeprazole activates oxidative stress, coupled with the activation of processes to constrain protein damage (ATF4, proteasome and heat shock modules in cluster 3) and repression of fatty acid metabolism and cell cycle progress (cluster 8).
These examples illustrate the utility of the PHH TXG-MAPr stress-pathway landscape for mechanistic evaluation of compounds and for comparing compounds with similar modes of action but with different degrees of toxicity.
Visualizing external datasets in the TXG-MAPr tool reveals good correlation between different cell types and gene expression platforms in the mode of action
Integrating historical and new datasets to derive mechanistic interpretations is important, but it is often hard to determine if joint information, such as WGCNA modules, provides a consistent view of biological responses across experiments. The high variability of gene-level analyses, and the high penalty for multiple comparisons also complicate data integration. To validate the utility of the PHH TXG-MAPr for integrating and comparing external datasets, we created a data upload function, which calculates new module eigengene scores (EGs) from the gene log2FC of the uploaded data and overlays the scores onto the module dendrogram for visualization (for further details, see Material and Methods paragraph “Uploading new data files”). To further describe CSA mode of action, we processed and uploaded samples from the GEO database of CSA prolonged repeated exposure followed by recovery in collagen sandwich cultured PHH (GSE83958 (Wolters et al. 2016)) into the PHH TXG-MAPr tool (Fig. 4a). The uploaded CSA dataset showed strong correlation (Pearson R > 0.8) in module activation and repression among the three timepoints. A heatmap of module EGs (absolute EGs > 2) separates repressed and activated modules in different clusters (Fig. 4b). ER stress annotated modules (PHH:13 and PHH:62) and an ATF4 module (PHH:15) are located in the same cluster, which are strongly activated by CSA at high dose levels (Fig. 4c). In addition, the 5 day 30 µM CSA treatment with 3 days recovery looks most similar to the CSA_24h_6 µM data from TG-GATEs based on module EGs correlation (R Pearson = 0.67, Fig. 4d). The absolute Pearson correlation between module EGs across the uploaded CSA conditions was higher than the absolute correlation between the gene log2FC (Supp. Fig. S5f). The uploaded data confirm that high dose cyclosporine A induces a strong ER stress response amongst others, which is present for several days during daily exposure and persists even after washout. A similar ER stress response (PHH:13 and PHH:62) was also present in HepG2 hepatocytes exposed to CSA (GEO dataset: GSE45635 (Van den Hof et al. 2015)) and ER stress module activation was seen at low (3 µM) and high (20 µM) dose (Suppl Fig. S5a, S6 cluster 5 red line).
Additional datasets of human hepatocytes (PHH, HepG2 and HepaRG) exposed the oxidative stress inducer acetaminophen (APAP) were analyzed to investigate the application of the PHH TXG-MAPr for a different compound (Suppl. Fig. S5b-e). The APAP datasets show a clear oxidative stress response (PHH:144, PHH:325) as well as many other module perturbations, which is independent of the cell type or microarray platform (Fig. S6, cluster 8 red line).
Pearson correlations between CSA and APAP treatments in TG-GATEs and the uploaded data from GEO are shown in a cluster correlation plot (Fig. 4e). Hierarchical clustering based on correlation distance between samples separates the data based on the mode of action, clustering ER stress inducers cyclosporine A and tunicamycin together and separated from oxidative stress inducer acetaminophen. In addition, different hepatocyte cell cultures (3D PHH, HepG2 and HepaRG) show good correlation with the same compound exposure, suggesting that a similar mode of action can be captured by other hepatocyte cells. GEO data of PHH exposed to APAP that was analyzed on two different Agilent microarray platforms correlate well with the TG-GATEs data at the same concentration (GEO_PHH_APAP_24h_5mM / GEO_PHH.3D_APAP_24h_5mM versus TG_PHH_APAP_24h_5mM), even though the average module coverage was 84 or 93% for the Agilent chips (e.g. only 84 or 93% of genes per module were measured in the chip). This indicates that the TXG-MAPr supports the analyses of gene expression data from different platforms/chipsets. In contrast, low dose or early time-points do not correlate well with the same compound, likely due to the minimal module perturbations in these conditions (data not shown).
Overall, the PHH TXG-MAPr upload and analysis functions are powerful tools to compare both global biological responses across disparate datasets and TG-GATEs data, while identifying specific modes-of-action, even for other hepatocyte cell types or other microarray platforms, suggesting that co-expression analysis can obviate technical issues when comparing across expression datasets or platforms.
Leveraging WGCNA to interpret targeted RNA Seq datasets
Although high throughput technologies allow many samples to be processed in parallel, the high costs of whole transcriptome profiling has led to the use of targeted RNA sequencing approaches which are gaining popularity for toxicology studies (Mav et al. 2018). We reasoned that mapping a targeted gene sets to co-expression modules might yield sufficient data to impute more detailed and biologically relevant information. We tested this concept using the PHH TXG-MAPr by uploading a targeted RNA Seq data set, TempO-Seq, obtained from treating plated cryopreserved PHH from 50 different human donors in a detailed dose–response protocol with tunicamycin at two time points (0.0001–10 µM, 8 and 24 h, Niemeijer et al. 2021). We calculated the modules EGs for each donor-concentration–time point combination, a total of 600 different conditions, following the approach detailed in “Material and methods” section.
Of the 2708 genes measured in the TempO-Seq S1500 + gene set, only 1830 genes are mapped to the PHH modules (Supp. Table S9). Genes not mapping were excluded from the PHH network during modules generation because they did not have robust co-expression patterns and are included in the PHH TG-GATEs ‘gray’ module (Langfelder and Horvath 2008). Although some modules were not covered by any genes from the S1500 + set (coverage of 0%), modules selected in Fig. 3 have significantly higher coverage (Wilcoxon test, greater alternative, p < 0.05 and Fig. 5a, b), suggesting that the preserved and well annotated modules are better represented in the S1500 + set. In addition, the coverage was higher when the hub gene was present in the S1500 + set (Fig. 5b). EGs calculation depends on both the fold change and the correlation with the eigengene score of each gene (corEG, a measure of hubness), therefore we assessed the corEG values for S1500 + genes as a gene quality metric. At the module level, the average corEG does not seem to be heavily impacted by the subsampling of gene space, and modules in the expanded seed set (Fig. 3) show higher concordance (Fig. 5c, Pearson R 0.84 versus Pearson R 0.65 including all modules). Additionally, the S1500 + set shows higher average corEG compared with random draws of genes from PHH modules (adj. p value < 0.05, see Material and Methods “Data analysis” section). We also assessed coverage within the set of the preserved modules; only two show coverage of 0%, PHH:103 and PHH:105 (Supp. Table S10). Thus, the S1500 + set represented a set of genes that overlap, to a greater or lesser extent, nearly all preserved modules and higher quality relative to their corEG values.
To test the impact of calculating EGs based on the S1500 + gene set, we recalculated EGs for all TG-GATEs samples using only the reduced S1500 + gene set and calculated the Pearson correlation per module compared to EGs calculated using the complete gene set (Fig. 5d). Modules even with minimal, but above zero, gene coverage show good EGs correlation (always higher than 0.5), and modules in the expanded seed set (Fig. 3) show on average higher correlation (Wilcoxon test, p value of 2.72e-09). Using the ER stressor tunicamycin as a model compound, we performed a cluster correlation analysis using EGs for TG-GATE PHH data, calculated using both whole chip or just the S1500 + genes, and compared results with a set of TempO-Seq, S1500 + data derived from 50 donor PHH cultures treated with tunicamycin (Fig. 5e, Supp. Figure 8). The Affymetrix and S1500 + derived scores for TG-GATEs tunicamycin treatments were always showing correlation > 0.7 and clustered together. However, there was considerable variation within the donor set at the highest dose (10 µM). A subset of donors shows higher similarity within themselves and with TG-GATEs tunicamycin samples, especially for medium concentration and when calculated with the S1500 + gene set (Fig. 5e, donors in upper left quadrant, to compared to TG-GATEs sample in the right bottom quadrant). Another subset of donors showed lower similarities within each other and lower similarity with TG-GATEs tunicamycin samples (Fig. 5e, lower right quadrant).
Thus, a targeted gene set can be used as input to derive EGs for modules derived from the whole transcriptome and impute biological responses on EGs.
Assessing donor variability captured by PHH modules
Since modules with similar biological annotations cluster together based on EGs scores (Fig. 3) we reasoned that identifying sources of variability at the level of clusters might be more robust than at the individual module level. Therefore, we clustered modules based on the S1500 + EGs of the 50-donor dataset similarly to Fig. 3, and assessed the overlap between clusters (Fig. S9). Some, but not all, modules clusters derived from the donor data show significant overlap with the groups obtained using the entire TG-GATEs set in Fig. 3 (Fig. S9b, exact Fisher p values and numerical overlaps are shown). In particular, RNA processing, stress clusters (ATF4 and ATF6) and transcriptions regulation modules are clustering similarly. The modest overlap is not surprising since the donor dataset is derived with a single compound with the PHH donors as the primary source of variability (Fig. 5e).
We then made use of modules EGs of S1500 + clusters 1–8 to study donor variability in response to tunicamycin. Donor variability can be attributable to genomic makeup, but also lifestyle, disease and age status of each individual. Consequently, we analyzed relationships between module scores, reflecting induction or repression of underlying genes, and the presence (positives) or absence (negatives) of donors’ traits, i.e. sex, the presence of cancer, liver pathology, hypertension, diabetes, smoking habit (Supp. Table S11).
We applied the methodology described in (Sutherland et al. 2018) and averaged modules EGs within each module clusters and modeling each concentration separately. We calculated Cohen’s d effect sizes (eff, (Cohen 2013)) and performed logistic regression treating avgAbsEG (average absolute cluster score) as a covariate and quantified the p value for each module in explaining the odds of toxicity (signed_log10_p_adj) reflecting the adjustment for differences in overall gene expression, i.e. avgAbsEG (Supp. Table S12). Next, we calculated the same relationship for individual modules EGs (Supp. Table S13).
Following this approach, we were able to identify modules and the underlying biological processes, which are uniquely associated with a donor’s trait upon tunicamycin exposure, and with increasing association levels in a dose–response relationship (Fig. 6). Module clusters have variable number of genes included but still show good gene quality as quantified by the high average correlation with EGs calculated with the entire TG-GATEs gene set, especially compared to modules excluded by these clusters (Figs. 5d and 6). We focused on the presence of liver pathology, that influences donor’s response based on different processes both positively and negatively (Fig. 6a). Modules annotated for DDR (PHH:83), protein folding (PHH:95) and oxidative stress (PHH:325 and PHH:144) are positively associated (induced) with liver pathology phenotype of the donor (Fig. 6a, Supp. Table S12, Supp. Fig. S10). Modules annotated for immune response had a negative association (repressed) with liver pathologies (PHH:12, PHH:26, PHH:44, PHH:22, Fig. 6a, Supp. Table S12 and S8 for module annotation). In particular, donors with liver pathologies show lower EGs for these modules, corresponding to lower log2FC values of the genes belonging to the modules, all annotated for being involved in inflammation and immune response (moderate and high negative Cohen’s d effect sizes, between − 0.4 and − 0.8, and adjusted p value < 0.05, Supp Table S13, Fig. 6b and Supp. Fig. S10c, showing genes in module PHH:12 with corEG > 0.8). To a closer inspection, genes with lower log2FCs are the result of already higher basal expression levels in PHH donors with pre-existing liver pathology (Fig. 6c, showing normalized counts for PHH:12 genes with significant difference between donors with or without liver pathologies, only with DMSO treatment, Wilcoxon test, adj. p value < 0.05).
To conclude, pre-existing traits of PHH donors influence the cellular response to an ER stressor (tunicamycin). Such an influence can be attributable not to different magnitude of ER stress response activation, but to variations in the activation of accompanying processes like immune response or oxidative stress.
In this work, we presented a novel Shiny web tool that allows users to apply a gene co-expression network (WGCNA) approach to investigate toxicity mechanisms rapidly and in the context of biological responses preserved between human and animals. This unbiased approach is not weighted toward current knowledge and can reveal unknown relationships between genes and phenotype data, but it also encompasses known biological response networks through modules annotation. In addition, this approach is tailored for a gold standard in vitro testing system: primary human hepatocytes (PHH).
Toxicogenomics data can be difficult to interpret due to its high dimensionality and possible low signal-to-noise ratio (Ideker et al. 2011). We reduced gene expression dimensionality from 104 to 102 by taking a modular approach and constructing networks of co-expressed genes from a large PHH transcriptomic dataset (Igarashi et al. 2015). A set of 398 modules are presented in an intuitive visualization format, the PHH TXG-MAPr. A simplified upload function enables calculation of eigengene scores (EGs) from external sources, allowing new data to be integrated and interpreted in the PHH TXG-MAPr analysis environment, which contains 158 chemical comparators combined in 941 different treatment conditions. Pathway enrichment and TF associations contribute to mechanistic interpretation of modules, allowing the user to identify key processes that are different and in common between compound-induced responses. This represents a starting point for mechanistic interpretation across risk assessment applications, for example, to derive data-driven quantitative AOPs based on molecular initiating and key events represented by modules’ EGs. Stress induced pathways interact in cascades of perturbations, balancing adaptive and progressive responses. We showed how modules and correlation between them can represent connected mechanisms (Fig. 3) and used these to describe compounds’ mode of action (MoA). Although correlation analysis is not sufficient to establish causality, it can suggest potential areas for additional analysis in order to define cause-and-effect relationship, e.g. through validation of a mechanism-based risk assessment framework (Liu et al. 2019). In addition, gene co-expression modules allows to understand how applicable a pathway is to other species (Perkins et al. 2015).
One of the most vexing problems in risk assessment is indeed understanding whether toxicology data are likely to extrapolate across species, e.g., from rodent to human, or from one model to another, e.g. from in vitro to in vivo. Co-expression analysis formalizes methods that allow comparison of node-edge relationships at the gene level (Langfelder et al. 2011). As a result, we were able to define a subset of modules, and the associated biological themes, with higher preservation between in vitro human and rat hepatocytes models. Interestingly, a number of preserved PHH modules show also overlap with published signatures of rat xenobiotic receptor activation (Podtelezhnikov et al. 2020): to mention a few, PHH:358 perfectly overlaps with the AHR signature, PHH:16 shows very high enrichment with gene members of the SREBP signature and prominent annotation for cholesterol biogenesis (Supp. Table S9). A remarkable observation of our results was the notable difference in preservation for some toxicologically relevant and well characterized stress pathways. For example, ER stress response is highly preserved, while DNA damage (p53-driven gene expression) and oxidative stress (Nrf2-driven) responses are not. Interestingly, when comparing module preservation results, we found that in vivo rat liver and RPH show higher similarities (Pearson R 0.85 and 0.60 for Z-summary and Median Rank, respectively). In addition, Nrf2/p53 modules show good preservation between PHH and HepG2 data (data not shown), suggesting the differences in module preservation are influenced by species rather than culture conditions. We expect that preservation results depend, to some extent, on the input dataset composition. For instance, how well a certain process is captured in gene expression data from the input experiments, which can influence gene-to-gene correlation patterns, could be a hypothesis to be explored in future studies. Nevertheless, literature reports that kinetics and activation of the non-preserved Nrf2/p53 processes are different across species (MacRae et al. 2015; Martin and Chang 2018; Monroe et al. 2020), suggesting that preservation of co-expression networks is a promising approach for the evaluation of the applicability domain of new approach methodologies (NAMs) for extrapolating safety assessments across species (Parish et al. 2020).
Comparison with external datasets is a crucial application of the PHH TXG-MAPr tool. Gene level comparisons can be risky given the likelihood of variability across models and the low signal-to-noise ratio. Within TG-GATEs data, we showed that compound correlations are improved at the module level (Fig. 1c, S1c, S5f). In addition, module EGs derived from microarray-based external GEO data shows high similarity across external and TG-GATEs datasets for the same compounds, suggesting that gene-level noise is dampened when results are evaluated at the modules level. Using the external GEO datasets, we also confirmed that ER stress is an early event in the mode of action of CSA in different liver cell types. In contrast, CSA did not show a good correlation with the MoA of APAP datasets, suggesting that oxidative stress is a secondary/late effect to high dose CSA exposure. Although several publications indicate that oxidative stress is involved during CSA induced cholestasis in liver (Kawamoto et al. 2017; Koido et al. 2020; Wolters et al. 2016), our analysis of CSA exposed hepatocytes using the TXG-MAPr tool showed that the ER stress should not be neglected in the mechanism of CSA induced liver injury as was also suggested by others (Koido et al. 2020; Van den Hof et al. 2015b). To validate the applicability of the PHH TXG-MAPr for other transcriptomic platforms, we showed that Agilent microarray data of APAP exposure in PHH showed strong correlation with Affymetrix based data in TG-GATEs.
Integrating and comparing data across platforms represent a great challenge, especially in cases where targeted gene sets have been designed to be representative for the whole transcriptome (Mav et al. 2018; Soufan et al. 2019). A reduced number of features is convenient to diminish complexity and costs, but can weaken the power of enrichment analysis given the smaller gene set background. Imputing the entire transcriptome response prior to analysis is one solution to this problem (Mav et al. 2020), but may not be tailored for the testing system in use. Using the TempO-Seq S1500 + gene set platform, we showed that EGs calculated with PHH co-expression modules can impute biological response patterns directly when analyzing and interpreting PHH data from targeted gene sets in the context of whole transcriptomic data. Although the targeted gene space is dramatically reduced compared to whole transcriptome data in TG-GATEs (~ 17%, 1830/10275 genes), the modules included in our collection of mechanistic relevant modules (Fig. 3) had significantly higher coverage compared to the entire module set, and include genes of high quality (as quantified by the corEG). Module EGs obtained with the complete gene set and with the S1500 + set are in good agreement, and the mechanistic relevant modules had significantly higher agreement than the rest. Taken together, these findings suggest that results based on only genes in the S1500 + platform can be extrapolated to recapitulate the whole transcriptome modules score variability.
Following this strategy, we circumvented the limited gene coverage of the S1500 + set by leveraging the correlation structure of modules to define biological responses relevant to the differential sensitivity across responses of 50 donors’ hepatocytes treated with an ER stress inducer, tunicamycin. Donors’ traits, particularly the presence of liver pathologies, show significant dose–response associations with clusters of modules referring to specific biological responses, namely immune, DNA damage, oxidative stress responses and protein folding. The presence of liver pathologies seems to be associated with a weakened activation of innate immune response upon ER stress induction. Interestingly, genes involved in innate immunity in donors with pre-existing liver pathologies show already higher expression values without treatment compared to donors without liver pathologies. Under conditions of excessive stress, ER stress evokes inflammatory responses via the activation of the three UPR branches, which may sensitize to adverse events (Duvigneau et al. 2019; Fredriksson et al. 2014; Peng et al. 2020). Donors with pre-existing liver diseases show already activated inflammation, therefore possibly resulting in even higher sensitivity. However, only innate immune response could be captured by the PHH model, as it includes isolated primary hepatocytes, but no other cellular components of the liver that contributes to the immune response. Modules related to DNA damage response, protein folding and oxidative stress response appear to be mildly sensitized in patients with liver pathologies upon ER stress induction. That is in accordance with a condition of chronic stress and continuous attempt to restore homeostasis (García‐Ruiz and Fernández‐Checa 2018). Our results partly overlap with findings in recent GWAS DILI studies, where cholestatic injury phenotype has been associated with UPR and mitochondrial stress, but most importantly with an increased sensitivity to ROS upon additional drug treatment (Koido et al. 2020). Interestingly, ER stress response did not show a connection with donor traits: ER stress related modules, collectively, were a stable and preserved set of biomarkers for the unfolded protein response. Ultimately these findings highlight how pre-existing liver diseases can be a confounding factor when interpreting data from in vitro models using donor hepatocytes and that conserve alterations even in 2D culture conditions.
In conclusion, the TXG-MAPr takes advantage of the modular nature of gene co-expression networks to achieve mechanistic relevant, cross-species and cross-platform evaluation of toxicogenomic data. Practically, we envision a multifaceted application of the TXG-MAPr. Toxicogenomic data can be acquired for candidate compounds, even in a high-throughput platform, and a summarized profile of preserved and highly interpretable co-expression module responses can be extracted. The latter can be used to determine the prevalent mode of action and to compare with legacy and internal data. Preserved co-expression PHH modules can be further exploited to derive translational biomarkers to be included in high-throughput in vitro screening methods, or to explore donor-to-donor different response, but also to benchmark donor’s pool responses. Overall, we demonstrated that gene co-expression analysis coupled to a facile visualization environment, the PHH TXG-MAPr, is a promising approach to analyze in vitro human transcriptomic data and derive mechanistic interpretation, and therefore a substantial step forward towards integration of transcriptomic data in mechanistic risk assessment practices.
Average absolute eigengene score
Correlation eigengene score
Drug-induced liver injury
DNA damage response
Integrated stress response
Mode of Action
Primary human hepatocytes
Rat primary hepatocytes
Weighted gene co-expression network analysis
Alexa A, Rahnenführer J (2007) topGO: Enrichment Analysis for Gene Ontology. R package
Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Hilda Ye B, Califano A (2016) Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nat Genet 48(8):838–847. https://doi.org/10.1038/ng.3593
Bailey J, Balls M (2019) Recent efforts to elucidate the scientific validity of animal-based drug tests by the pharmaceutical industry, pro-testing lobby groups, and animal welfare organisations. BMC Med Ethics 20(1):16. https://doi.org/10.1186/s12910-019-0352-3
Barel G, Herwig R (2018) Network and pathway analysis of toxicogenomics data. Front Genet 9:484. https://doi.org/10.3389/fgene.2018.00484
Björnsson ES (2019) Global epidemiology of drug-induced liver injury (DILI). Curr Hepatol Rep 18(3):274–279. https://doi.org/10.1007/s11901-019-00475-z
Breiden B, Sandhoff K (2019) Emerging mechanisms of drug-induced phospholipidosis. Biol Chem 401(1):31–46. https://doi.org/10.1515/hsz-2019-0270
Campos G, Schmidt-Heck W, De Smedt J, Widera A, Ghallab A, Pütter L, González D, Edlund K, Cadenas C, Marchan R, Guthke R, Verfaillie C, Hetz C, Sachinidis A, Braeuning A, Schwarz M, Weiß TS, Banhart BK, Hoek J, Godoy P (2020) Inflammation-associated suppression of metabolic gene networks in acute and chronic liver disease. Arch Toxicol 94(1):205–217. https://doi.org/10.1007/s00204-019-02630-3
Chang W, Cheng J, Allaire JJ, Sievert C, Schloerke B, Xie Y, Allen J, McPherson J, Dipert A and Borges B (2021). shiny: Web Application Framework for R. R package version 1.6.0. https://cran.r-project.org/package=shiny
Clark M, Steger-Hartmann T (2018) A big data approach to the concordance of the toxicity of pharmaceuticals in animals and humans. Regul Toxicol Pharmacol 96:94–105. https://doi.org/10.1016/j.yrtph.2018.04.018
Cohen J (2013) Statistical power analysis for the behavioral sciences. L. Erlbaum Associates, Hillsdale, NJ. https://doi.org/10.4324/9780203771587
Colombo M, La Vecchia C, Lotti M, Lucena MI, Stove C, Paradis V, Newsome P (2019) EASL clinical practice guideline: occupational liver diseases. J Hepatol 71(5):1022-1037. https://doi.org/10.1016/j.jhep.2019.08.008
Csardi G, Nepusz T (2006). The igraph software package for complex network research, InterJournal, Complex Systems 1695. http://igraph.sf.net
Duvigneau JC, Luís A, Gorman AM, Samali A, Kaltenecker D, Moriggl R, Kozlov AV (2019) Crosstalk between inflammatory mediators and endoplasmic reticulum stress in liver diseases. Cytokine 124:154577. https://doi.org/10.1016/j.cyto.2018.10.018
Foufelle F, Fromenty B (2016) Role of endoplasmic reticulum stress in drug-induced toxicity. Pharmacol Res Perspect 4(1):e00211. https://doi.org/10.1002/prp2.211
Fredriksson L, Wink S, Herpers B, Benedetti G, Hadi M, De Bont H, Groothuis G, Luijten M, Danen E, De Graauw M, Meerman J, van de Water B (2014) Drug-induced endoplasmic reticulum and oxidative stress responses independently sensitize toward TNFα-mediated hepatotoxicity. Toxicol Sci 140(1):144–159. https://doi.org/10.1093/toxsci/kfu072
Garcia-Alonso L, Iorio F, Matchan A, Fonseca N, Jaaks P, Peat G, Pignatelli M, Falcone F, Benes CH, Dunham I, Bignell G, McDade SS, Garnett MJ, Saez-Rodriguez J (2018) Transcription factor activities enhance markers of drug sensitivity in cancer. Can Res 78(3):769–780. https://doi.org/10.1158/0008-5472.CAN-17-1679
Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J (2019) Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res 29(8):1363–1375. https://doi.org/10.1101/gr.240663.118
García-Ruiz C, Fernández-Checa JC (2018) Mitochondrial oxidative stress and antioxidants balance in fatty liver disease. Hepatol Commun 2(12):1425–1439. https://doi.org/10.1002/hep4.1271
Gautier L, Cope L, Bolstad BM, Irizarry RA (2004) Affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20(3):307–315. https://doi.org/10.1093/bioinformatics/btg405
Grinberg M, Stöber RM, Edlund K, Rempel E, Godoy P, Reif R, Widera A, Madjar K, Schmidt-Heck W, Marchan R, Sachinidis A, Spitkovsky D, Hescheler J, Carmo H, Arbo MD, van de Water B, Wink S, Vinken M, Rogiers V, Hengstler JG (2014) Toxicogenomics directory of chemically exposed human hepatocytes. Arch Toxicol 88(12):2261–2287. https://doi.org/10.1007/s00204-014-1400-x
Hetz C, Zhang K, Kaufman RJ (2020) Mechanisms, regulation and functions of the unfolded protein response. Nat Rev Mol Cell Biol 21(8):421–438. https://doi.org/10.1038/s41580-020-0250-z
Ideker T, Dutkowski J, Hood L (2011) Boosting signal-to-noise in complex biology: prior knowledge is power. Cell 144(6):860–863. https://doi.org/10.1016/j.cell.2011.03.007 (Elsevier)
Igarashi Y, Nakatsu N, Yamashita T, Ono A, Ohno Y, Urushidani T, Yamada H (2015) Open TG-GATEs: a large-scale toxicogenomics database. Nucleic Acids Res 43(D1):D921–D927. https://doi.org/10.1093/nar/gku955
Kamburov A, Stelzl U, Lehrach H, Herwig R (2013) The ConsensusPathDB interaction database: 2013 Update. Nucleic Acids Res 41(D1):D793. https://doi.org/10.1093/nar/gks1055
Karin M, Dhar D (2016) Liver carcinogenesis: from naughty chemicals to soothing fat and the surprising role of NRF2. Carcinogenesis 37(6):541–546. https://doi.org/10.1093/carcin/bgw060
Kawamoto T, Ito Y, Morita O, Honda H (2017) Mechanism-based risk assessment strategy for drug-induced cholestasis using the transcriptional benchmark dose derived by toxicogenomics. J Toxicol Sci 42(4):427–436. https://doi.org/10.2131/jts.42.427
Koido M, Kawakami E, Fukumura J, Noguchi Y, Ohori M, Nio Y, Nicoletti P, Aithal GP, Daly AK, Watkins PB, Anayama H, Dragan Y, Shinozawa T, Takebe T (2020) Polygenic architecture informs potential vulnerability to drug-induced liver injury. Nat Med 26(10):1541–1548. https://doi.org/10.1038/s41591-020-1023-0
Kolde R (2019) pheatmap: Pretty Heatmaps. R package. https://CRAN.R-project.org/package=pheatmap
Krewski D, Andersen ME, Tyshenko MG, Krishnan K, Hartung T, Boekelheide K, Wambaugh JF, Jones D, Whelan M, Thomas R, Yauk C, Barton-Maclaren T, Cote I (2020) Toxicity testing in the 21st century: progress in the past decade and future perspectives. Arch Toxicol 94(1):1–58. https://doi.org/10.1007/s00204-019-02613-4
Langfelder P, Horvath S (2008) WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics 9(1):559. https://doi.org/10.1186/1471-2105-9-559
Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is my network module preserved and reproducible? PLoS Comput Biol 7(1):1001057. https://doi.org/10.1371/journal.pcbi.1001057
Lanzoni A, Castoldi AF, Kass GEN, Terron A, De Seze G, Bal-Price A, Bois FY, Delclos KB, Doerge DR, Fritsche E, Halldorsson T, Kolossa-Gehring M, Hougaard Bennekou S, Koning F, Lampen A, Leist M, Mantus E, Rousselle C, Siegrist M, Younes M (2019) Advancing human health risk assessment. EFSA J 17(S1):170712. https://doi.org/10.2903/j.efsa.2019.e170712
Liu Z, Huang R, Roberts R, Tong W (2019) Toxicogenomics: a 2020 vision. Trends Pharmacol Sci 40(2):92–103. https://doi.org/10.1016/j.tips.2018.12.001 (Elsevier Ltd)
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15(12):550. https://doi.org/10.1186/s13059-014-0550-8
MacRae SL, Croken MMK, Calder RB, Aliper A, Milholland B, White RR, Zhavoronkov A, Gladyshev VN, Seluanov A, Gorbunova V, Zhang ZD, Vijg J (2015) DNA repair in species with extreme lifespan differences. Aging 7(12):1171–1184. https://doi.org/10.18632/aging.100866
Mandrekar P, Catalano D, Jeliazkova V, Kodys K (2008) Alcohol exposure regulates heat shock transcription factor binding and heat shock proteins 70 and 90 in monocytes and macrophages: implication for TNF-α regulation. J Leukoc Biol 84(5):1335–1345. https://doi.org/10.1189/jlb.0407256
Martin LJ, Chang Q (2018) DNA damage response and repair, DNA methylation, and cell death in human neurons and experimental animal neurons are different. J Neuropathol Exp Neurol 77(7):636–655. https://doi.org/10.1093/jnen/nly040
Mav D, Shah RR, Howard BE, Auerbach SS, Bushel PR, Collins JB, Gerhold DL, Judson RS, Karmaus AL, Maull EA, Mendrick DL, Merrick BA, Sipes NS, Svoboda D, Paules RS (2018) A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics. PLoS One 13(2):e0191105. https://doi.org/10.1371/journal.pone.0191105
Mav D, Phadke DP, Balik-Meisner MR, Merrick BA, Auerbach S, Niemeijer M, Huppelschoten S, Baze A, Parmentier C, Richert L, van de Water B, Shah RR, Paules RS (2020) Utility of extrapolating human S1500+ genes to the whole transcriptome: tunicamycin case study. Bioinform Biol Insights 14:117793222095274. https://doi.org/10.1177/1177932220952742
Monroe JJ, Tanis KQ, Podtelezhnikov AA, Nguyen T, Machotka SV, Lynch D, Evers R, Palamanda J, Miller RR, Pippert T, Cabalu TD, Johnson TE, Aslamkhan AG, Kang W, Tamburino AM, Mitra K, Agrawal NGB, Sistare FD (2020) Application of a rat liver drug bioactivation transcriptional response assay early in drug development that informs chemically reactive metabolite formation and potential for drug-induced liver injury. Toxicol Sci 177(1):281–299. https://doi.org/10.1093/toxsci/kfaa088
Niemeijer M, Wiecek W, Huppelschoten W, Bouwman P, Baze A, Parmentier C, Richert L, Paules RS, Bois FY, van de Water B (2021) Transcriptomic mapping of the inter-individual variability of cellular stress response activation in primary human hepatocytes. bioRxiv 2021.08.26.457742. https://doi.org/10.1101/2021.08.26.457742
Morin MJ, Bernacki RJ (1983) Biochemical effects and therapeutic potential of tunicamycin in murine L1210 leukemia. Can Res 43(4):1669–1674
Onakpoya IJ, Heneghan CJ, Aronson JK (2016) Post-marketing withdrawal of 462 medicinal products because of adverse drug reactions: a systematic review of the world literature. BMC Med 14(1):10. https://doi.org/10.1186/s12916-016-0553-2
Osataphan N, Phrommintikul A, Chattipakorn SC, Chattipakorn N (2020) Effects of doxorubicin-induced cardiotoxicity on cardiac mitochondrial dynamics and mitochondrial function: Insights for future interventions. J Cell Mol Med 24(12):6534–6557. https://doi.org/10.1111/jcmm.15305
Pakos-Zebrucka K, Koryga I, Mnich K, Ljujic M, Samali A, Gorman AM (2016) The integrated stress response. EMBO Rep 17(10):1374–1395. https://doi.org/10.15252/embr.201642195
Paradis E, Schliep K (2019) Ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35(3):526–528. https://doi.org/10.1093/bioinformatics/bty633
Parish ST, Aschner M, Casey W, Corvaro M, Embry MR, Fitzpatrick S, Kidd D, Kleinstreuer NC, Lima BS, Settivari RS, Wolf DC, Yamazaki D, Boobis A (2020) An evaluation framework for new approach methodologies (NAMs) for human health safety assessment. Regul Toxicol Pharmacol 112:104592. https://doi.org/10.1016/j.yrtph.2020.104592
Peng C, Stewart AG, Woodman OL, Ritchie RH, Qin CX (2020) Non-alcoholic steatohepatitis: a review of its mechanism models and medical treatments. Front Pharmacol. https://doi.org/10.3389/fphar.2020.603926
Perkins E, Garcia-Reyero N, Edwards S, Wittwehr C, Villeneuve D, Lyons D, Ankley G (2015) The adverse outcome pathway: a conceptual framework to support toxicity testing in the twenty-first century. Computational systems toxicology. Humana Press, New York, NY, pp 1–26. https://doi.org/10.1007/978-1-4939-2778-4_1
Phillips JR, Svoboda DL, Tandon A, Patel S, Sedykh A, Mav D, Kuo B, Yauk CL, Yang L, Thomas RS, Gift JS, Allen Davis J, Olszyk L, Alex Merrick B, Paules RS, Parham F, Saddler T, Shah RR, Auerbach SS (2019) BMD Express 2: enhanced transcriptomic dose-response analysis workflow. Bioinformatics 35(10):1780–1782. https://doi.org/10.1093/bioinformatics/bty878
Podtelezhnikov AA, Monroe JJ, Aslamkhan AG, Pearson K, Qin C, Tamburino AM, Loboda AP, Glaab WE, Sistare FD, Tanis KQ (2020) Quantitative transcriptional biomarkers of xenobiotic receptor activation in rat liver for the early assessment of drug safety liabilities. Toxicol Sci 175(1):98–112. https://doi.org/10.1093/toxsci/kfaa026
Rana P, Aleo MD, Gosink M, Will Y (2019) Evaluation of in vitro mitochondrial toxicity assays and physicochemical properties for prediction of organ toxicity using 228 pharmaceutical drugs. Chem Res Toxicol 32(1):156–167. https://doi.org/10.1021/acs.chemrestox.8b00246 (American Chemical Society)
Reuben A, Tillman H, Fontana RJ, Davern T, Mcguire B, Stravitz RT, Durkalski V, Larson AM, Liou I, Fix O, Schilsky M, Mccashland T, Hay JE, Murray N, Shaikh OS, Ganger D, Zaman A, Han SB, Chung RT, Lee WM (2016) Outcomes in adults with acute liver failure between 1998 and 2013: an observational cohort study. Ann Intern Med 164(11):724–732. https://doi.org/10.7326/M15-2211
Rezzani R (2004) Cyclosporine A and adverse effects on organs: Histochemical studies. Prog Histochem Cytochem 39(2):85–128. https://doi.org/10.1016/j.proghi.2004.04.001
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015) Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43(7):e47. https://doi.org/10.1093/nar/gkv007
Rivetti C, Allen TEH, Brown JB, Butler E, Carmichael PL, Colbourne JK, Dent M, Falciani F, Gunnarsson L, Gutsell S, Harrill JA, Hodges G, Jennings P, Judson R, Kienzler A, Margiotta-Casaluci L, Muller I, Owen SF, Rendal C, Campos B (2020) Vision of a near future: bridging the human health–environment divide. Toward an integrated strategy to understand mechanisms across species for chemical safety assessment. Toxicol in Vitro 62:104692. https://doi.org/10.1016/j.tiv.2019.104692 (Elsevier Ltd)
Ron D, Walter P (2007) Signal integration in the endoplasmic reticulum unfolded protein response. Nat Rev Mol Cell Biol 8(7):519–529. https://doi.org/10.1038/nrm2199 (Nature Publishing Group)
Safe S, Jin UH, Park H, Chapkin RS, Jayaraman A (2020) Aryl hydrocarbon receptor (AHR) ligands as selective AHR modulators (SAhRMs). Int J Mol Sci 21(18):6654. https://doi.org/10.3390/ijms21186654
Sax NI (1975) Dangerous properties of industrial materials - 4th edition. Van Nostrand Reinhold. ISBN 978-0442273682
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498–2504. https://doi.org/10.1101/gr.1239303
Smith JR, Hayman GT, Wang SJ, Laulederkind SJF, Hoffman MJ, Kaldunski ML, Tutaj M, Thota J, Nalabolu HS, Ellanki SLR, Tutaj MA, De Pons JL, Kwitek AE, Dwinell MR, Shimoyama ME (2020) The year of the rat: the rat genome database at 20: a multi-species knowledgebase and analysis platform. Nucleic Acids Res 48(D1):D731–D742. https://doi.org/10.1093/nar/gkz1041
Solotke MT, Dhruva SS, Downing NS, Shah ND, Ross JS (2018) New and incremental FDA black box warnings from 2008 to 2015. Expert Opin Drug Saf 17(2):117–123. https://doi.org/10.1080/14740338.2018.1415323
Soufan O, Ewald J, Viau C, Crump D, Hecker M, Basu N, Xia J (2019) T1000: a reduced gene set prioritized for toxicogenomic studies. PeerJ 7:e7975. https://doi.org/10.7717/peerj.7975
Stacklies W, Redestig H, Scholz M, Walther D, Selbig J (2007) pcaMethods—a bioconductor package providing PCA methods for incomplete data. Bioinformatics 23(9):1164–1167. https://doi.org/10.1093/bioinformatics/btm069
Sutherland JJ, Jolly RA, Goldstein KM, Stevens JL (2016) Assessing concordance of drug-induced transcriptional response in rodent liver and cultured hepatocytes. PLoS Comput Biol 12(3):e1004847. https://doi.org/10.1371/journal.pcbi.1004847
Sutherland JJ, Webster YW, Willy JA, Searfoss GH, Goldstein KM, Irizarry AR, Hall DG, Stevens JL (2018) Toxicogenomic module associations with pathogenesis: a network-based approach to understanding drug toxicity. Pharmacogenomics J 18(3):377–390. https://doi.org/10.1038/tpj.2017.17
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, Von Mering C (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43(D1):D447–D452. https://doi.org/10.1093/nar/gku1003
Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB (2001) Missing value estimation methods for DNA microarrays. Bioinformatics 17(6):525. https://doi.org/10.1093/bioinformatics/17.6.520
Vahle JL, Anderson U, Blomme EAG, Hoflack JC, Stiehl DP (2018) Use of toxicogenomics in drug safety evaluation: current status and an industry perspective. Regul Toxicol Pharmacol 96:18–29. https://doi.org/10.1016/j.yrtph.2018.04.011
Van den Hof WFPM, Ruiz-Aracama A, Van Summeren A, Jennen DGJ, Gaj S, Coonen MLJ, Brauers K, Wodzig WKWH, van Delft JHM, Kleinjans JCS (2015) Integrating multiple omics to unravel mechanisms of Cyclosporin A induced hepatotoxicity in vitro. Toxicol in Vitro 29(3):489–501. https://doi.org/10.1016/j.tiv.2014.12.016
Van Summeren A, Renes J, Lizarraga D, Bouwman FG, Noben JP, Van Delft JHM, Kleinjans JCS, Mariman ECM (2013) Screening for drug-induced hepatotoxicity in primary mouse hepatocytes using acetaminophen, amiodarone, and cyclosporin A as model compounds: an omics-guided approach. OMICS 17(2):71–83. https://doi.org/10.1089/omi.2012.0079
Vickers AEM, Ulyanov AV, Fisher RL (2017) Liver effects of clinical drugs differentiated in human liver slices. Int J Mol Sci 18(3):574. https://doi.org/10.3390/ijms18030574
Watkins PB (2011) Drug safety sciences and the bottleneck in drug development. Clin Pharmacol Ther 89(6):788–790. https://doi.org/10.1038/clpt.2011.63
Weaver RJ, Blomme EA, Chadwick AE, Copple IM, Gerets HHJ, Goldring CE, Guillouzo A, Hewitt PG, Ingelman-Sundberg M, Jensen KG, Juhila S, Klingmüller U, Labbe G, Liguori MJ, Lovatt CA, Morgan P, Naisbitt DJ, Pieters RHH, Snoeys J, Park BK (2020) Managing the challenge of drug-induced liver injury: a roadmap for the development and deployment of preclinical predictive models. Nat Rev Drug Discov 19(2):131–148. https://doi.org/10.1038/s41573-019-0048-x
Wickham H (2016) ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, NY. ISBN 978-3-319-24277-4. https://ggplot2.tidyverse.org
Wink S, Hiemstra S, Huppelschoten S, Danen E, Niemeijer M, Hendriks G, Vrieling H, Herpers B, Van De Water B (2014) Quantitative high content imaging of cellular adaptive stress response pathways in toxicity for chemical safety assessment. Chem Res Toxicol 27(3):338–355. https://doi.org/10.1021/tx4004038
Wolters JEJ, Van Herwijnen MHM, Theunissen DHJ, Jennen DGJ, Van Den Hof WFPM, De Kok TMCM, Schaap FG, Van Breda SGJ, Kleinjans JCS (2016) Integrative “-omics” analysis in primary human hepatocytes unravels persistent mechanisms of cyclosporine a-induced cholestasis. Chem Res Toxicol 29(12):2164–2174. https://doi.org/10.1021/acs.chemrestox.6b00337
Woolbright B, Jaeschke H (2017) The impact of sterile inflammation in acute liver injury. J Clin Transl Res 3(1):170–188. https://doi.org/10.18053/jctres.03.2017s1.003
WHO-IPCS (2018) Guidance document on evaluating and expressing uncertainty in hazard characterization - 2nd edition. IPCS harmonization project document 11. Geneva: World Health Organization. ISBN 978-92-4-151354-8
Yang Y, Nadanaciva S, Will Y, Woodhead JL, Howell BA, Watkins PB, Siler SQ (2015) MITOsym®: a mechanistic, mathematical model of hepatocellular respiration and bioenergetics. Pharm Res 32(6):1975–1992. https://doi.org/10.1007/s11095-014-1591-0
Yin W, Mendoza L, Monzon-Sandoval J, Urrutia AO, Gutierrez H (2021) Emergence of co-expression in gene regulatory networks. PLoS ONE 16(4):e0247671. https://doi.org/10.1371/journal.pone.0247671
Yorita Christensen KL, Carrico CK, Sanyal AJ, Gennings C (2013) Multiple classes of environmental chemicals are associated with liver disease: NHANES 2003–2004. Int J Hyg Environ Health 216(6):703–709. https://doi.org/10.1016/j.ijheh.2013.01.005
Zhang J, Venkat D (2020) Frequent offenders and patterns of injury. Clin Liver Dis 24(1):37–48. https://doi.org/10.1016/j.cld.2019.09.002
Zhang S, Wang C, Tang S, Deng S, Zhou Y, Dai C, Yang X, Xiao X (2014) Inhibition of autophagy promotes caspase-mediated apoptosis by tunicamycin in HepG2 cells. Toxicol Mech Methods 24(9):654–665. https://doi.org/10.3109/15376516.2014.956915
Zhang L, Dong Y, Wang W, Zhao T, Huang T, Khan A, Wang L, Liu Z, Xie J, Niu B (2020) Ethionine suppresses mitochondria autophagy and induces apoptosis via activation of reactive oxygen species in neural tube defects. Front Neurol 11:242. https://doi.org/10.3389/fneur.2020.00242
Zhang B, Horvath S (2005) A General Framework for Weighted Gene Co-Expression Network Analysis. Stat Appl Genet Mol Biol. https://doi.org/10.2202/1544-6115.1128
This work was supported by the EU-EFPIA Innovative Medicines Initiative 2 (IMI2) Joint Undertaking TransQST project (grant number 116030) and eTRANSAFE project (grant number 777365) (this Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation program and EFPIA), the EC Horizon2020 EU-ToxRisk project (grant number 681002), and the Cosmetics Europe/CEFIC Liver Ontology project.
Conflict of interest
JSR received funding from GSK and Sanofi and consultant fees from Travere Therapeutics.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Callegaro, G., Kunnen, S.J., Trairatphisan, P. et al. The human hepatocyte TXG-MAPr: gene co-expression network modules to support mechanism-based risk assessment. Arch Toxicol 95, 3745–3775 (2021). https://doi.org/10.1007/s00204-021-03141-w
- Mechanism-based risk assessment
- ER stress