Recommended approaches in the application of toxicogenomics to derive points of departure for chemical risk assessment

There is increasing interest in the use of quantitative transcriptomic data to determine benchmark dose (BMD) and estimate a point of departure (POD) for human health risk assessment. Although studies have shown that transcriptional PODs correlate with those derived from apical endpoint changes, there is no consensus on the process used to derive a transcriptional POD. Specifically, the subsets of informative genes that produce BMDs that best approximate the doses at which adverse apical effects occur have not been defined. To determine the best way to select predictive groups of genes, we used published microarray data from dose–response studies on six chemicals in rats exposed orally for 5, 14, 28, and 90 days. We evaluated eight approaches for selecting genes for POD derivation and three previously proposed approaches (the lowest pathway BMD, and the mean and median BMD of all genes). The relationship between transcriptional BMDs derived using these 11 approaches and PODs derived from apical data that might be used in chemical risk assessment was examined. Transcriptional BMD values for all 11 approaches were remarkably aligned with corresponding apical PODs, with the vast majority of toxicogenomics PODs being within tenfold of those derived from apical endpoints. We identified at least four approaches that produce BMDs that are effective estimates of apical PODs across multiple sampling time points. Our results support that a variety of approaches can be used to derive reproducible transcriptional PODs that are consistent with PODs produced from traditional methods for chemical risk assessment. Electronic supplementary material The online version of this article (doi:10.1007/s00204-016-1886-5) contains supplementary material, which is available to authorized users.


Introduction
Animal-based toxicity testing is expensive, time-consuming, and requires large numbers of animals. For example, the National Toxicology Program (NTP) estimates that a rodent cancer bioassay requires 860 animals, $2-$4 million, and 5 years to plan, conduct, and evaluate. As a result, the NTP has generally only conducted an average of 12 cancer bioassays/year since this program was launched in the 1970s. Further, due to the need for animal-based toxicity data for hazard identification and dose-response analysis for many chemicals requiring risk assessment, the US Abstract There is increasing interest in the use of quantitative transcriptomic data to determine benchmark dose (BMD) and estimate a point of departure (POD) for human health risk assessment. Although studies have shown that transcriptional PODs correlate with those derived from apical endpoint changes, there is no consensus on the process used to derive a transcriptional POD. Specifically, the subsets of informative genes that produce BMDs that best approximate the doses at which adverse apical effects occur have not been defined. To determine the best way to select predictive groups of genes, we used published microarray data from dose-response studies on six chemicals in rats exposed orally for 5, 14, 28, and 90 days. We Environmental Protection Agency's (EPA) Integrated Risk Information System (IRIS) has only evaluated 570 chemicals since the EPA created the IRIS program (1985( ) up until March, 2016. In Canada, an important challenge is the requirement to assess the potential for risk to human health of a large number of existing chemicals in a short timeframe. Specifically, the Government of Canada, under the Chemicals Management Plan launched in 2006, has a commitment to address 4300 existing substances identified as priorities by 2020; many of these substances have a paucity of traditional toxicology data (Barton-Maclaren et al. 2016). Traditional whole animal testing is not feasible for the regulatory testing of all chemicals requiring evaluation; hence, there has been an increase in the need for, and application of, non-traditional tools and approaches to support decision-making as the program progresses.
In light of the tens of thousands of chemicals in commerce, and the thousands of new chemicals being developed annually, an urgent need to shift from these conventional toxicology tests toward higher throughput mechanistic and quantitative approaches has been identified (Committee on Toxicity Testing and Assessment of Environmental Agents 2007; Council of Canadian Academies 2016; Firestone et al. 2010;NRC 2009NRC , 2010. Multiple alternative approaches have been proposed to improve and expedite chemical testing, including the application of toxicogenomics, cell-based assays, high-throughput testing, and computational modeling. In particular, toxicogenomics has been identified as important in the next generation of risk science Cote et al. 2016;Guyton et al. 2009;Krewski et al. 2014).
Alteration in mRNA expression following chemical exposure is one of the earliest quantifiable effects in a toxicological response. Genomics technologies, such as DNA microarrays and RNA-sequencing (RNA-seq), measure global transcriptional changes in a tissue or cell type following chemical exposure. Abundant evidence indicates that changes in mRNA levels occur during chemical toxicity and that characterizing these changes can provide meaningful information for toxicological assessment (Thomas et al. 2001. Analyzing mRNA expression changes in cells or tissues following toxicant exposure offers a new dimension to hazard and mode of action (MOA) identification, assessment of human and animal variability in response to chemicals, and estimation of the doses at which adverse non-cancer and cancer effects occur (Hester et al. 2015;Jackson et al. 2014;Labib et al. 2015;Moffat et al. 2015;Thomas et al. 2007Thomas et al. , 2011Thomas et al. , 2012Webster et al. 2015a). The use of transcriptomic data has been suggested for informing the MOA of a chemical as part of a weight of evidence in risk assessment NRC 2007), and recently it has been proposed that quantitative transcriptomic data may be used to determine benchmark dose (BMD) to estimate a chemical's point of departure (POD) ; Thomas et al. 2013b;Webster et al. 2015a).
Numerous studies have applied BMD modeling to analyze dose-response relationships for global gene expression data. These studies have found that transcriptional PODs are in agreement with PODs derived using apical endpoints (e.g., histology, organ weight, cancer, etc.) Black et al. 2014;Bourdon et al. 2013;Dong et al. 2015; Thomas et al. 2007Thomas et al. , 2013aWebster et al. 2015a). For example, Thomas et al. (Thomas et al. 2013a, b) reported a high degree of correlation between transcriptional BMD values for the 'most sensitive pathway' (i.e., the lowest median pathway BMD) and BMD values for apical endpoints in rodent dose-response experiments for six chemicals over a variety of exposure durations (5-, 14-, 28-, and 90-day exposures). A study on mice exposed to the carcinogen furan for 3 weeks reported that the overall mean/median pathway BMD was consistent with hepatocellular adenoma and carcinoma BMDs from both DNA microarrays data and RNA-seq data (Webster et al. 2015a). It has also been demonstrated that BMDs for pathways associated with key events in a chemical's MOA are good predictors of the doses at which associated adverse apical effects occur (Bhat et al. 2013;Chepelev et al. 2015a, b;Hester et al. 2015;Webster et al. 2015a). Overall, there is agreement that transcriptional BMDs representing defined groups of genes provide a potentially effective approach for establishing a POD, but there is no consensus on how the genes that are used should be selected (e.g., lowest pathway BMD, BMD from genes in a pathway that is associated with a key event in the chemical's MOA, overall median/ mean transcriptional BMD, etc.) and what the repercussions of different agencies applying different approaches might be on the ultimate PODs used in risk assessment. Thus, a thorough analysis of different methods for estimating a transcriptional BMD and recommendations for best practices are required before application of transcriptional PODs in human health risk assessment can be realized.
The primary purpose of this study was to expand on the previous work described above by systematically exploring different approaches to selecting genes from transcriptional data to derive a POD, and comparing these to PODs derived from apical endpoints within the same rodent models. To do this, we leveraged published Affymetrix microarray data on well-designed dose-response studies in rats with matching (derived from the same rats) apical data. BMDExpress ) was used to derive BMD and lower confidence limit benchmark dose (BMDL) values for gene expression. Eleven approaches to select groups of genes and molecular pathways that are relevant to a chemical's toxicity were evaluated for derivation of a transcriptional BMD that is representative of the chemical response (Table 1). Three of these approaches are based on methods proposed previously, including BMD mean or median of all responsive pathways, and the lowest pathway BMD Webster et al. 2015a). These BMD(L) values were compared to the BMD(L)s for conventional toxicology endpoints from the same animals, and to cancer bioassay results published elsewhere. Because MOA-determination is time-consuming, the present work focuses on approaches that can be applied for relatively rapid transcriptional POD derivation without MOA knowledge; this provides an additional advantage for the application of the approach for the assessment of substances with limited toxicity data. The overarching goal of this work is to advance the utility and application of transcriptional BMDs for use in human health risk assessment.

Materials and methods
The data used in this study were publicly available and were selected for use because they contain both transcriptomic and matching apical endpoints across a variety of time points and across a dose-range spanning the no observed effects levels and the maximum tolerated doses. Thus, they provide an ideal dataset for this analysis. Details of the original experiment are described below.

Summary of experimental model and previously published work
Details of the study design, animal exposures, necropsy, histology, serum clinical chemistry, blood concentration of chemicals, and microarray methods were reported previously (Dodd et al. 2012a(Dodd et al. , b, c, d, 2013aThomas et al. 2013a, b) (Fig. 1). In brief, (1) male Sprague-Dawley rats were administered 1,2,4-tribromobenzene (TBB; CAS No. 615-54-3) at 0, 2.5, 5, 10, 25, or 75 mg/kg per day (mkd) by oral gavage; (2) male Fischer 344 (F344) rats were administered bromobenzene (BB; CAS No. 108-86-1) at 0, 25, 100, 200, 300, or 400 mkd by oral gavage; (3) male Sprague-Dawley rats were administered 2,3,4,6-tetrachlorophenol (TCP; CAS No. 58-90-2) at 0, 10, 25, 50, 100, or 200 mkd by oral gavage; (4) female F344 rats were exposed to 4,4′-methylenebis(N,N′-dimethyl)aniline (CAS number 101-61-1; MDA) in feed at six doses (0, 50, 200, 375, 500, or 750 ppm); (5) female F344 rats were exposed to N-nitrosodiphenylamine (NDPA; CAS No. 86-30-6) by dietary feed at 0, 250, 1000, 2000, or 4000 ppm; and (6) female F344 rats were exposed to hydrazobenzene (HZB; CAS No. 122-66-7) by dietary feed at concentrations of 0, 5, 20, 80, 200, or 300 ppm ( Table 2). The selected rodent strain, sex, and route of exposure were those that showed the critical effect in the previous IRIS assessments for these chemicals. Feed or oral gavage concentrations were selected based on the doses used in the IRIS reviews. Rats were administered or fed the chemicals for 5, 14, 28, or 90 days. Two hundred and forty-five 4-to 5-week old rats were used for each chemical (n = 10 per group). Clinical signs of toxicity, body weights, and food consumption of animals were checked daily. Necropsies were conducted at scheduled time points. Following gross examination for abnormalities, the target organs (target organs were selected based on previous studies of the test article) were removed, weighed, and prepared for histopathological assessment and gene expression microarray measurements. Fig. 1 Study method overview. White boxes show animal exposures, necropsies, histology and microarray procedures that were conducted in previous studies (Dodd et al. 2012a(Dodd et al. , b, c, d, 2013aThomas et al. 2013a, b). Blue boxes represent procedures that were conducted in the current study (color figure online) For RNA analysis, target tissues were either flash frozen (TBB, TCP, and HZB) or preserved in RNAlater (BB, MDA, and NDPA) at the time of necropsy. RNA isolated from the primary target tissues of six rats per dose per time point was analyzed using Affymetrix microarrays. Because all of the chemicals selected in this study had published toxicological data, target tissues were known in advance. Target tissues were liver (TBB, BB, TCP and HZB), thyroid (MDA), and bladder (NDPA). DNA microarray hybridization was performed for 16 h on HT Rat230 + PM microarrays. The complete microarray datasets were downloaded from the Gene Expression Omnibus (GEO) at the NCBI (Accession No. GSE45892).
Gene expression data were normalized using robust multi-array average (RMA) (Irizarry et al. 2003) and log2 transformed.

Benchmark dose calculation
Because we are modeling both apical and transcriptional changes, to avoid confusion, we refer to a BMD derived from an apical endpoint as a BMD a , and from a transcriptional endpoint as a BMD t throughout the manuscript. When both BMD and BMDL are discussed, BMD(L) is used. Unless otherwise stated, BMD(L) a values in the current paper refer to non-cancer apical endpoints.

BMD analysis of apical effects (BMD a )
BMD(L) a values were modeled for apical endpoint measurements [described in Dodd et al. (2012aDodd et al. ( , b, c, d, 2013a] using the US EPA's BMD Software (BMDS, version 2.60) (Davis et al. 2011) with BMDS Wizard (ICF international, version 1.1). The dose-dependent histological changes including hypertrophy, hyperplasia, necrosis, and vacuolation were fit as dichotomous data, and organ weight increases were fit as continuous data. All models specified in the BMD modeling guidelines (U.S. EPA 2012) were run for the appropriate data type (dichotomous data: Gamma, Dichotomous-Hill, Logistic, LogLogistic, Probit, LogProbit, Weibull, and Multistage; continuous data: Exponential 4, Exponential 5, Hill, Power, Polynomial, and Linear). The best-fitting model was selected based on adequacy of the fit of the model to the data using automated rules in BMDS wizard (U.S. EPA 2012). No manual interpretation of results was performed; BMD(L) a values were selected based on the program recommendation as described previously (Wignall et al. 2014). The BMDS wizard categorized fitted models into Viable, Questionable or Unusable. Only Viable model outputs were used in this study. If no model was Viable, the highest dose was removed and the models were re-run. If no model was Viable after removing the highest doses and only three doses remained (including control), the dose-response dataset was reported as having failed BMD modeling. For BMD a values that were higher than the highest dose, the BMD(L) a was not considered and "failed BMD modeling" was recorded.
The BMD a calculations for cancer are described in detail in a previous study ). Briefly, cancer BMD a values for thyroid carcinoma/adenoma incidences, bladder carcinoma, and liver carcinomas for MDA, NDPA, and HZB, respectively, were calculated using BMDS software.

BMD analysis of transcriptional changes (BMD t )
BMDExpress was used for dose-response modeling and BMD t estimation for each gene . A statistical test was used as a pre-filter to identify genes that were significantly altered in at least one dose group relative to concurrent control rats (statistical test applied for each approach is described in Table 1 and described in more detail below). Subsequently, a best fit model (Hill, Power, Linear, Polynomial 2°, or Polynomial 3°) was identified for each gene based on: (1) a nested Chi-square test cutoff of 0.05 to choose between linear and polynomial models; (2) the least complexity based on the Akaike Information Criterion (AIC) for the Linear, Polynomial, Hill, and Power models; and (3) a goodness-of-fit test p value >0.1. Other parameters included: power restricted to ≥1, maximum iterations of 250 (the convergence criteria for the model fitting), confidence interval of 0.95, and benchmark response (BMR, the number of standard deviations at which the BMD is defined) set to 1.349 ). The Hill model was flagged if the "k" parameter was <1/3 of the lowest positive dose (Black et al. 2012). In such cases, the next best model with a goodness-of-fit test p value >0.05 was selected. In the case where no model had a p value >0.05, probes that fit Hill models were considered and the lowest BMD t value (only BMD t derived from Hill models excluding flagged models) was multiplied by 0.5 for use in subsequent analyses.
Using the built-in defined category analysis feature, probes with BMD t s were mapped to Ingenuity Pathway Analysis (IPA) canonical pathways (downloaded on April 24, 2014). Promiscuous probes (probes annotated with more than one gene), as well as probes with BMD t s higher than the highest dose and goodness-of-fit test p value <0.1, were removed.

Identification of differentially expressed genes
Differentially expressed genes (those with mRNA levels that were significantly increased or decreased following chemical exposure relative to concurrent controls) were identified using microarray analysis of variance (MAANOVA). This analysis was conducted in R (R Core Team 2015) using the MAANOVA library (Wu et al. 2003). The Fs statistic (Cui et al. 2005) was used to test for treatment effects. The p values were estimated by the permutation method with residual shuffling followed by the false discovery rate (FDR) adjustment (Benjamini and Hochberg 2007). The fold-change estimates were determined using least-square means (Goodnight and Harvey 1978;Searle et al. 1980).

Ingenuity Pathway Analysis (IPA)
Gene expression data were analyzed using IPA (QIAGEN Redwood City, www.qiagen.com/ingenuity) to identify significant enrichment of genes in specific molecular pathways and to predict activated upstream regulators. IPA Core Analysis with a gene expression threshold of fold change ≥±1.5 and FDR p ≤ 0.05 was run, and enriched canonical pathways that were statistically significant (p ≤ 0.05) were selected. IPA calculates the p value using the right-tailed Fisher's exact test. In this method, the p value for a given pathway is calculated by considering the number of differentially expressed genes (FC ≥±1.5 and FDR p ≤ 0.05) that participate in that pathway and the total number of genes that are known to be associated with that pathway.

BMDExpress Data Viewer
Pathway enrichment analysis using the BMDExpress Data Viewer tools are described in detail elsewhere . Briefly, the datasets derived from BMDExpress were used in this analysis. The Affymetrix probe sets were first converted into unique genes (based on NCBI Entrez Gene ID). The output files (BMD t analyzed and IPA mapped files) were uploaded to the BMDExpress Data Viewer Functional Enrichment Analysis tool. This tool performs enrichment analysis using a Fisher's exact test. The Fisher's exact test is identical to conventional pathway analysis (e.g., IPA), except it only applies the analysis to genes that passed the BMD t filtering criteria and have a BMD t . Thus, it explores pathway enrichment for genes that show a dose-response and are significantly increased in at least one dose group relative to controls. A list of significant pathways (p < 0.05) for each dataset is obtained. Only pathways with four or more genes with BMD t values were considered.

Transcriptomic-based approaches to predict POD
Eight novel approaches were explored to select and group genes for BMD t analysis (Approaches 1-8). The specific details of each approach are described in Table 1 along with a very brief rationale for why this approach was considered. The mean of the gene BMD t s within the groups defined by these approaches was calculated. The correlation between the BMD t s derived for each approach and the BMD a values (as well as no and lowest observable adverse effects levels: NOAEL and LOAEL, respectively) was computed. We also used three approaches (Approaches 9-11) that have been used previously Webster et al. 2015b) to derive BMD t s (details again provided in Table 1). The BMD t s derived from each of the 11 approaches are presented and analyzed as potential PODs for application in risk assessment. We note that previous approaches to derived transcriptional BMDs have applied different statistical filtering prior to BMD modeling, ranging from no statistical filter to a conservative filter of FDR p ≤ 0.05 and an unadjusted p ≤ 0.05. We previously showed that it is important to pre-filter global transcriptional data prior to BMD t modeling to reduce noise (Webster et al. 2015a). Thus, we applied two analytical methods for pre-filtering data-MAANOVA FDR p ≤ 0.05 and ANOVA unadjusted p ≤ 0.05 (Table 1) to include both a conservative and liberal filter, respectively, in the present study. In Approaches 1, 2, 4, 5, 6, 7, and 8, microarray data were normalized using RMA, a MAANOVA was performed, and genes with FDR p values ≤0.05 and fold changes ≥1.5 were imported into BMDExpress. In Approaches 3, 9, 10, and 11, RMA normalized data were directly imported into BMDExpress and analyzed by ANOVA (retaining genes with p < 0.05). The subsequent BMD-Express analysis was similar for all approaches following these pre-filtering steps (Table 1). For our analysis, a pathway was only assigned a BMD t if it had a minimum of four genes with BMDs within that pathway. The approaches are also described in detail in Supplementary methods. The mean BMD for each gene with a BMD within a pathway was used to represent the BMD t for that pathway. BMDs derived from each approach represent the mean gene expression BMD for the following groups of genes: • Approach 1-The 20 significantly enriched pathways with the lowest BMD t s. • Approach 2-The 20 most statistically significantly enriched pathways. • Approach 3-The 20 lowest pathway BMD t s.
• Approach 4-The 20 genes with the largest fold changes relative to controls. • Approach 5-Genes with BMD t s within the 25th-75th percentile. • Approach 6-The 20 pathways with the greatest number of shared genes. • Approach 7-The 20 genes that contribute to the greatest number of enriched pathways. • Approach 8-The BMDs of genes that are regulated by the 20 most significant upstream regulators. • Approach 9-The significantly enriched pathway with the lowest BMD t (i.e., most sensitive pathway). • Approach 10-The mean of gene BMDs across all pathways. • Approach 11-The median gene BMD across all pathways.

Estimating distributions for each approach
Distributions for the mean or median (Approach 11 only) BMD t for the 11 approaches were estimated using the bootstrap (Efron 1993). BMD t estimates for genes or for genes within pathways were randomly selected with replacement. For those approaches that employed pathway information, the mean gene BMD was used to estimate the pathway BMD t . To estimate the BMD t for each approach, the mean across pathways or the mean across genes was used for each bootstrap. A total of 2000 bootstraps were used to approximate the distribution for each of the 11 approaches. The mean values were used as the transcriptional POD for each approach (except Approach 11) in the subsequent analyses in the manuscript because the mean values had a lower variance than the median values.

Correlation analysis
Pearson's correlations were estimated using the R statistical package. In this analysis, BMD and BMDL values in ppm were converted to milligrams per kilogram-day using strain-specific subchronic food intake factors (female F344 rats 0.113 and male F344 0.1) calculated based on recommended biological values from the EPA (1988). Linear associations between the NOAEL, LOAEL, and the apical endpoint were visualized using scatterplots. The one-to-one line and the 95% confidence curves were also displayed for each approach. These results were visualized using scatterplots.

Likelihood ratio test
A likelihood test was used to test each approach to determine whether it was statistically significantly different from the 1:1 ideal line (the null model). The likelihood ratio statistic is the difference in the likelihood function under the alternative hypothesis and the likelihood function under the null hypothesis; this difference is then multiplied by 2. The likelihood statistic is distributed as a Chi-square distribution with the degrees of freedom equal to the difference in dimensionality of the parameter space under the alternative and null hypothesis. In this analysis, the degrees of freedom is 2. The null model was rejected if the p value was <0.05.

Three criteria for assessing approaches
We applied three criteria to assess the effectiveness of the approaches in identifying a relevant POD: (1) the mean ratio of the BMD(L) t derived from an approach should be less than threefold the apical POD; (2) significance of the Pearson's correlation coefficient (p value) should be <0.05; and (3) the significance of the likelihood ratio test (p value) in deviating from the 1:1 slope should be >0.05.

Results and discussion
Apical data BMD(L) a values were calculated for changes in target organ weight and histology (Table 3). For each chemical, the lowest BMD a value for these apical endpoints within a time point and across all time points was determined ( Figures S2, S3). Because we are not using these data for human health risk assessment, no additional considerations were made (e.g., the relevance of apical endpoint to human health, reversibility of effect, the impact of the allometric exponent). Generally, the BMD a values decreased over time. The lowest BMD a values were observed at 90 days for TBB, TCP, NDPA, and HZB, and 28 days for BB and MDA. The lowest BMD a s across all time points by our calculations were: 4.9 mkd for TBB (hepatocyte hypertrophy), 1567 ppm for NDPA (diffuse transitional epithelial hyperplasia), 55 ppm for HZB (bile duct duplication), 47 mkd for BB (increase in absolute liver weight), 4.5 mkd for TCP (hepatocyte hypertrophy), and 43 ppm for MDA (follicular cell hypertrophy). The lowest BMD a s across all time points for TBB, NDPA, and HZB were similar to the NOAEL values (5 mkd, 1000, and 80 ppm, respectively) reported in previous studies (Dodd et al. 2012b(Dodd et al. , d, 2013a. However, the BMD a values for BB, TCP, and MDA were approximately fourfold, twofold, and fourfold lower than NOAEL values (200, 10 mkd, and 200 ppm, respectively) (Dodd et al. 2012a(Dodd et al. , c, 2013b. We note that no histopathological changes were observed for HZB at the 5-, 14-, and 28-day time points (Dodd et al. 2012b); thus, BMD a values were not calculated. We used BMD a s as apical PODs, but also considered previously published NOAEL and LOAEL values.
Although BMD a s offer several advantages over NOAELs or LOAELs [e.g., less dependence on dose spacing, statistical criteria, and making full use of the information on the shape of the dose-response curve (EFSA 2009)], BMD a analyses also have some limitations. For example, severity of the effect observed in apical endpoints (e.g., histopathology observations in Dodd et al. papers were generally ranked 1, 2, 3, 4, or 5 representing a minimal, light/mild, moderate, moderately severe, or severe/high incidence), which is an important factor for NOAEL and LOAEL, is not considered in BMD a derivation. Thus, we decided to compare BMD t s derived from transcriptional data to NOAEL, LOAEL, as well as the lowest BMD a at the matched time point and the lowest BMD a overall of all of the apical data.

General characteristics
The number of significantly differentially expressed genes (FC ≥1.5 and FDR p ≤ 0.05) at each time point generally increased in a dose-dependent manner, but did not appear to be time dependent (Table 4). Lists of differentially expressed genes and the BMD(L) t values in response to TBB, BB, TCP, MDA, NDPA, and HZB for all time points are shown in Tables S1-S24.

Visual inspection of the distributions of transcriptional and apical BMDs
Eight approaches to selecting gene and molecular pathway BMD t s for derivation of a transcriptional POD were applied as well as three previously proposed approaches (approaches summarized in Table 1). The mean BMD t s of genes and pathways for each approach were calculated (all gene BMDs available in Tables S1-S24; data available upon request for each approach in 43 supplementary tables). Distributions of BMD t s for each approach were visualized by box and whisker plots (Figs. 2, S4). Figure 2 includes horizontal colored lines to indicate the candidate POD values for apical endpoints that would be considered in a risk assessment, including the cancer BMD a (when available), NOAEL and LOAEL values for apical endpoints measured in the rodents in this study, and the lowest BMD a s for these animals. Visual inspection of Fig. 2 suggests a high degree of overlap in the BMD t s and ranges for each of the approaches despite being drawn from, in many cases, very different gene lists. Approach 9 (the significantly enriched pathway with the lowest BMD t ) generally produces the lowest BMD t s, but also appears to have the broadest interquartile ranges, suggesting a higher degree of variability and uncertainty in applying this approach. In contrast, Approaches 10 and 11 (mean and median of all pathways) yield BMD t s that are somewhat higher (with tighter distributions), but are remarkably similar to the majority of the other approaches, despite representing the entire selection of pathway BMD t s rather than the most statistically significant, or lowest BMD t s. It was not possible to apply Approach 1 or 2 to HZB at the 5-day time point because there were no pathways that were significantly enriched in IPA, indicating a limitation of this approach. Approach 4, based on the 20 genes with the greatest fold changes, tended to have lower BMD t s than the other approaches. We note that coefficient of variation (CV) values (Tables S25, S26) were below 0.2 for all approaches except 9, which indicates relatively low dispersion of data points around the mean BMD t of these approaches. Overall, visual inspection suggests that the majority of the approaches yield comparable BMD t values, within tenfold of the corresponding BMDa, that are largely consistent with the various PODs derived from apical endpoint analysis. Below we explore the relationship of the BMD t s derived in each of the approaches to BMD a s in more detail.

Relationship between transcriptional and apical endpoint BMDs
The BMD t s derived from each approach were divided by the POD values (NOAEL, LOAEL, lowest BMD a at matched time point, and lowest BMD a overall) for each chemical to explore the relationship between transcriptomic-based PODs derived from each approach to apical PODs (Figs. 3, S5-S8). From Fig. 3, it is evident that the majority of BMD t s fall within threefold of the NOAEL and LOAEL (Fig. 3a, b). We compared the BMD t at 5 days to the lowest overall BMD a to see how predictive early BMD t s would be for later apical effects (Fig. 3a, b). It is worth noting that the BMD a s generally decline over time ( Figure S9), and the BMD a at day 5 is always greater than the lowest BMD a across all times as expected ( Figure S10). We find that the BMD t s across all time points are somewhat higher than the lowest BMD a from the target tissue across all time points, but, nonetheless, are generally still within tenfold.
The correlation between log-transformed BMD(L) t s derived from each of the approaches and the log-transformed apical POD values (including lowest BMD a , NOAEL, LOAEL, and the time-point-matched lowest BMD a s) were also calculated to determine coefficient correlations (r) and linear significance (p values) in order to determine the extent of correlation between transcriptional and apical data. Figure S11 shows the 5-day time point for this correlation analysis; the other time points are shown in Figures S12, S13, and S14; Tables 5, S27 and S29. The linear relationship between the apical POD values and BMD(L) t s for the 11 approaches was tested to assess whether it approached a 1:1 relationship using the likelihood ratio test (Tables 5, S27-S29). Overall, we found strong correlations between BMD(L) t values derived from each approach and apical POD values. The r values for all approaches (264 r values were derived) were within the range of 0.54-0.99 across the dataset (67, 67, 84, and 18% of the correlations were statistically significant at 5, 14, 28, and 90 days, respectively). Assessment of the BMD(L) t values derived from the 11 approaches clearly showed that the transcriptional data were closely aligned with LOAELs, followed by NOAELs, and then the timepoint-matched lowest BMD a s. The linear regression models for most of the transcriptional approaches were not significantly different from a 1:1 relationship with the apical data.
We applied three criteria to assess the effectiveness of the approaches in identifying a relevant POD. For simplicity, because of the size of the dataset, we have focussed on presenting the results from the 5-day time point for each chemical in more detail and place these findings in the context of the other time points (Table 5). The complete results  Fig. 3 BMD(L) t s relative to apical PODs for the 5-day time point. Threefold and tenfold ranges from the apical POD are within the shaded area and the dashed horizontal lines, respectively. a The BMD t s derived from each approach were divided by the corresponding apical POD values for every chemical; b data from a shown separately for each chemical; c the BMDL t s derived from each approach were divided by NOAEL and LOAEL; d data from c shown separately for each chemical obtained from 14-, 28-, and 90-day time points are found in the supplementary materials (Tables S27-29).

BMD(L) t s compared to the NOAEL for apical effects
On day 5, the maximal differences between BMD t values for each chemical and the NOAEL derived from analysis of apical endpoints were less than tenfold for 61 of the 64 datasets (11 approaches × 6 chemicals − 2* = 64; *Approaches 1 and 2 could be performed for five chemicals only); analysis of TCP using Approaches 3, 10, and 11 yielded BMD t s that were greater than tenfold the NOAEL (Fig. 3a, b). Indeed, the BMD t /NOAEL ratios for the majority of the approaches (42 out of 64 data points) were smaller Check marks (√) and exclamation marks (!) indicate whether an approach met or failed to meet the criteria, respectively than 3. Analysis of the average of this ratio across all of the chemicals for every approach revealed that the BMD t mean derived using Approach 4 was closest to the NOAEL (twofold greater than the NOAEL; Fig. 3a). In addition, a significant correlation between the BMD t mean values derived using Approach 4 and the NOAEL (r = 0.84, p < 0.05) was found ( Figure S11; Table 5-panel A). Moreover, BMD t s derived using Approach 4 plotted against apical NOAEL values was not significantly different from the 1:1 (ideal) line (Table 5-panel A; Figure S11). Based on the three criteria, Approach 4 is thus the most effective at predicting the NOAEL at the 5-day time point, followed by Approaches 1, 2, 7, 8, and 9 that meet two criteria each. The results for the other time points were largely consistent with the results at 5 days. The majority of BMD t s were within tenfold, and generally within threefold, of the apical NOAELs (Figures S5, S6). Approaches 10 and 11 were again marginally higher than the other approaches for a few data points, and Approaches 4 and 9 were generally closest to the NOAEL. The specific number of data points (e.g., for each chemical and each approach = 66 data points) within threefold at 14, 28, and 90 days were 48/66, 45/66, and 47/66, respectively. Average correlation coefficients for all approaches in the 14, 28, and 90 day datasets were 0.85, 0.82, and 0.77, respectively, similar to average r values for 5-day time points (r = 0.83). Based on our three criteria, Approaches 1, 4, 5, 6, and 7 at 14 days, Approaches 1, 7, and 9 at 28 days, and Approach 4 at 90 days were effective in predicting NOAELs.

BMD(L) t s compared to the LOAEL for apical effects
At the 5-day time point, the BMD t values for all of the approaches for all six of the chemicals were within threefold of LOAEL values, and predominantly within threefold of the LOAELs as well. Only 8 of the 64 data points were outside of the threefold range (Fig. 3a, b). In general, the average BMD t s were remarkably similar to the LOAELs across all of the approaches. The average BMD t s for Approaches 4 and 9 were slightly below the LOAEL. Indeed, the BMD t mean values for all data points in Approaches 1, 2, 7, and 8 for all chemicals were within threefold of the LOAEL (Fig. 3). The averages of the BMD t /LOAEL ratios for the six chemicals were very close to 1 in all cases, with Approach 9 having the largest divergence from 1 (ratio = 1.0; CV = 0.318). In general, all approaches except 3, 9, 10, and 11 were significantly correlated with the LOAEL (Pearson's correlation). The largest and smallest correlation coefficients were found for Approach 2 (r = 0.97; p value = 0.005) and Approach 9 (r = 0.65; p value = 0.16), respectively (Table 5-panel B; Figure S11). The likelihood ratio test results showed transcriptional-to-LOAEL linear regression models for all approaches were not significantly different from 1:1, except for Approach 2. Based on the three criteria, BMD t values derived from Approaches 1, 4, 5, 6, 7, and 8 most accurately predict the LOAEL at the 5-day time point (Tables 5-panel B The results for the other time points were largely consistent with the results at 5 days ( Figures S5, S6). All of the BMD t s were within tenfold of the LOAELs, and generally within threefold of the LOAELs. The number of data points within threefold of LOAELs for the 14-, 28-and 90-day datasets was 55/66, 60/66, and 50/66, respectively. Indeed, at 14, 28, and 90 days the means of the BMD t /LOAEL ratios for all approaches were very close to 1. The maximum and minimum correlation coefficients between the approaches and their matched LOAELs were 0.91 and 0.77 in the 14-day datasets, 0.87 and 0.77 in the 28-day datasets, and 0.88 and 0.70 in the 90-day datasets. The likelihood ratio test results showed transcriptional-to-LOAEL linear regression models for all approaches in the 14-, 28-, 90-day time points were not significantly different from 1:1. Our results show that seven, ten, and three of the approaches met our three criteria and may be recommended for predicting the LOAEL for the 14-, 28-, and 90-day time points (Tables 5-panel B, 6-panel B, S27-S29).
For the 5-day time point, the BMDL t values for all approaches were within tenfold of the LOAEL (except for NDPA, Approach 9), and only 10 of the 64 data points were outside of the threefold range ( Figures S7, S8). All of the data points from Approaches 1, 2, 5, 6, and 7 were within threefold of the LOAEL. All approaches were highly correlated with the LOAEL (Table 5-panel F). Similar to the BMD t analysis, the BMDL t values for Approaches 2 (r = 0.97; p value = 0.001) and 9 (r = 0.55; p value = 0.25) had the highest and lowest correlation coefficients with apical data. The BMDL t values for Approaches 2, 3, 5, 6, 7, and 8 were closest to the LOAEL (Figures S7, S8). The likelihood ratio test results showed no significant difference from the 1:1 line for any of the approaches in the 5-day datasets (Table 5-panel F).
Similar results were found in our analysis of BMDL t s for the other time points. All approaches (except Approach 4 at 14 and 28 days, and Approaches 1, 6, 7, and 9 at 90 days) were within tenfold of the LOAEL, and there were 11/64, 8/66, and 18/66 data points outside the threefold range in the 14-, 28-, and 90-day time points, respectively. The BMDL t s values for all approaches and time points were generally highly correlated with the LOAELs. Out of the 11 approaches, BMDL t s from seven, nine, and three of the approaches in the 14-, 28-, and 90-day time point datasets, respectively, were significantly (p < 0.05) correlated with their respective LOAELs. The maximum and minimum r values were 0.90 and 0.72 at 14 days, 0.87 and Table 6 Assessment of each approach against the three criteria for predicting apical PODs at the 5-, 14-, 28-, and 90-day time points Check marks (√) and exclamation points (!) indicate whether an approach met or did not meet the three criteria, respectively Panel B-BMD t versus LOAEL Panel D-BMD t versus lowest apical BMD a across any time point 0.69 at 28 days, and 0.87 and 0.57 at 90 days (Tables S27-S29). No significant difference from the 1:1 line (BMD t vs. LOAEL) was found, with the exception of Approach 4 (14 days) and Approaches 4 and 9 (90 days). Overall, seven, seven, nine, and two of the approaches in the 5, 14, 28, and 90 day datasets (respectively) met the three criteria and thus may be recommended for predicting the LOAEL (Tables 5-panel F, 6-panel F, S27-S29).

BMD t s compared to time-point-matched lowest BMD a
The results of current study show that BMD t values derived from transcriptomic data for most of the approaches were remarkably close to the lowest BMD a at 5 days. The average ratios of BMD t -to-BMD a values at the 5-day time point for all approaches were <1.5, with Approach 9 having the largest divergence from 1 (Table 5-panel C; Fig. 3). BMD t s for all approaches on day 5, with one exception (Approach 9, NDPA), were within threefold of the lowest BMD a at the 5-day time point (Fig. 3). These results suggest that at the 5-day time point similar doses are required to trigger both transcriptional and apical responses. The BMD t s for all approaches, except Approach 9, were significantly correlated with the lowest BMD a s at 5 days. The maximum and minimum correlation coefficients were 0.98 (p < 0.01) for Approaches 10 and 11, and 0.81 (p = 0.1) for Approach 9 (Table 5-panel C; Fig. 3). There were no significant deviations from the 1:1 line for transcriptionalto-apical comparisons, except for Approaches 10 and 11 (Table 5-panel C). Thus, all approaches except 9, 10, and 11 meet the criteria and are effective predictors of timepoint-matched BMD a s (Table 6-panel C). Analysis of the other time points was largely consistent with the results at 5 days ( Figures S5, S6). The overwhelming majority of BMD t s at 14 days were within tenfold and generally within threefold of the lowest time-matched BMD a , with no greater than a 2.6-fold mean BMD t /BMD a ratio for any approach. For the 28-and 90-day time points, the mean BMD t /BMD a ratios for all 11 approaches were >3, with the exception of Approach 4. BMD t s for 51 of the 55 and 55 of the 66 approaches were within tenfold of BMD a s at the 28-and 90-day time points, respectively. It should be noted that all BMD t /BMD a ratios that yielded a value >10 were derived from TCP. These results indicate (for the current chemicals) that gene alterations and apical effects occurred at similar doses on day 5; however, at later time points, time-matched apical responses generally occurred at somewhat lower doses than transcriptional responses (Tables S27-S29). However, we note that the vast majority of the BMD t s even at the 90-day time point were within tenfold of apical PODs, suggesting their utility even at 90 days. Similar to 5 days, all approaches except 9, 10, and 11 meet the criteria for day 14 datasets. However, only Approach 4 met all three criteria at the 28-and 90-day time points (Table 6-panel C). Thomas et al. (2013a, b) proposed the use of the lowest pathway BMD t for prediction of BMD a s. Using this approach, a strong correlation (r > 0.90) was found between BMD t s and the lowest time-point-matched BMD a values for adverse apical endpoints at 5-, 14-, 28-, and 90-day time points for these same six chemicals ). In the current study, two main modifications were made to this method (Approach 9). The modifications were to: (1) apply the analysis only to ANOVA pre-filtered genes (p < 0.05 in at least one dose group); and (2) to remove pathways that were not significantly enriched following BMD t analysis (Fisher's exact test; p ≤ 0.05). We directly compared our BMD t s to those published in Thomas et al. (2013a, b) (Figure S15) and found that they were highly comparable. Specifically, our results were similar to those of Thomas et al. (2013a, b) at the 5-, 14-, and 28-day time points (r = 0.81, 0.81, 0.97, respectively), but not at 90 days (r = 0.61). These results suggest that addition of a Fisher's exact test to this Approach did not improve relationships with BMD a s. Indeed, the linear relationship between BMD t s derived from the lowest transcriptional pathway and its time-point-matched apical endpoint decreased due to this additional filtering. Because the lowest pathway BMD t may in certain cases be based on only a handful of genes (depending on the minimum number of genes required in a pathway for BMD t consideration that is applied), we felt this additional filtering ensured a more robust and responsive set of genes for this application. Thus, despite the weaker correlation, we advise the application of this filter should this approach be applied.

BMD t s compared to the lowest BMD a across all time points
At the 5-day time point, the ratio of the BMD t s to the lowest overall BMD a (across all time points) tended to be high. However, the majority of the BMD t s were within tenfold of the lowest BMD a s at the 5-day time point (Fig. 3); of the 64 data points, only five were below 1. Seven ratios were >10, but these were spread across approaches and were all for the chemical TCP (Fig. 3b). The correlations between the mean BMD t s for Approaches 1, 2, 4, 5, 6, 7, and 8 and the lowest overall BMD a s were significant (p < 0.05), with r > 0.86 ( Figure S11). The results of the likelihood ratio test showed that at the 5-day time point there were no significant differences between the 1:1 line and transcriptional-to-apical lines for Approaches 3, 4, and 9 (Table 5-panel D). However, no approach met all three of our selection criteria at the 5-day time point (Tables 5-panel D , 6-panel D). Thus, BMD t s are the least effective in predicting the lowest BMD a relative to all of the other apical PODs assessed on day 5.
Analyses of the other time points were similar to the above. Unsurprisingly, the ratios tended to be larger than 1 for BMD t s relative to the lowest BMD a , although the BMD t s across all approaches were predominantly within tenfold of the BMD a . The maximum and minimum correlation coefficients (r value) were 0.94 and 0.71 for 14 days, 0.94 and 0.84 for 28 days, and 0.76 and 0.63 for 90 days (Tables S27-S29). There was no significant difference between a 1:1 line and the transcriptional-to-apical lines for Approach 9 (14 days), Approach 4 (28 days), and Approaches 1, 3, 4, 7, and 9 (90 days). However, only Approach 4 (28 days) met all three of our criteria for prediction of the lowest BMD a across all time points .
Overall, we found that while the majority of the approaches were effective predictors of apical PODs at the 5-, 14-, and 28-day time point, only three approaches (Approaches 4, 5, and 8) met our three criteria at the 90-day time point (Table 6). Approaches 9, 10, and 11 only met the three criteria at the 28-day time point to estimate NOAEL and LOAEL. Approach 9 tended to have larger variation than the other approaches, and Approaches 10 and 11 (mean and median overall pathway BMD t s, respectively) tended to overestimate POD as expected. Across the entire dataset, the BMD t s were best at predicting the LOAEL and the lowest time-point-matched BMD a , whereas BMDL t s were relatively equal in being effective predictors of LOAEL and NOAEL.

Overall comparison between transcriptional BMD(L) t s and apical PODs
We estimated the potential ability of the approaches to predict apical POD at each time point based on the sum of the criteria met by BMD t -or BMDL t -derived for each approach against apical PODs (Table 7). These results suggest that Approach 7 followed by 1 was the best predictor at 5, 14, and 28 days. At 90 days, however, Approach 4 was the best method to predict apical PODs. In addition, Approach 4 was the best on day 5 and yielded the highest score overall.

Transcriptional BMD t s compared to the tumor responses
Tumor responses in thyroid, bladder, and liver, respectively, were reported in rodents following MDA, NDPA, and HZB exposure (NTP 1978(NTP , 1979a. Incidences of tumor development were analyzed in a previous study to obtain cancer BMD a values . We found that BMD t values for all of the approaches at 5, 14, and 28 days (Figure S16-panel A) were within tenfold of the tumor response. BMD t values derived from the 11 approaches for MDA and NDPA were within threefold at these time points ( Figure  S16-panel B). The average ratios of BMD t -to-cancer BMD a values at the 90-day time point for all approaches were <3. While more carcinogenic chemicals are required to calculate a correlation coefficient, based on the results derived from three chemicals in the current study the transcriptomic-derived BMD t values at 90 days were slightly closer to cancer BMD a values than other time points.

Statistical filtering
BMDExpress is equipped with a built-in tool for filtering a dataset and selecting the probe set ). This tool uses a one-way ANOVA together with a FDR correction for multiple comparisons (Benjamini and Hochberg 2007), and the user is allowed to choose between FDR Table 7 Assessment of BMD t and BMDL t derived from each approach against the three criteria for predicting apical PODs at the 5-, 14-, 28-, and 90-day time points BMD t values were assessed for predicting four apical endpoints (NOAEL, LOAEL, lowest BMD a at matched time point of apical endpoints, lowest BMD a overall of apical data); BMDL t values were assessed for predicting two apical endpoint (NOAEL and LOAEL). Sum = (three criteria × four apical PODs × four time points = 48) + (three criteria × two apical PODs × four time points = 24) = 72. The top approaches that met the three criteria are highlighted in bold (there was a three-way tie for second place) p value and p value. A previous study from our laboratory investigated the effects of using statistically filtered data on gene and pathway BMD t s, and more specifically compared FDR p value with ANOVA unadjusted p value (Webster et al. 2015a). The study showed that pre-filtering data in BMDExpress significantly reduced the mean gene and pathway BMD t s. Moreover, transcriptional data that were subjected to more stringent filtering produced BMD t s that were closer to apical PODs. These results are not surprising since stringent filtering reduces false positives and ensures that only genes that truly respond to the treatment are considered. However, more stringent filtering methods also reduce the number of discoveries and increase the probability of not retaining a sufficient number of genes for BMD t modeling. For these reasons, Webster et al. (2015a, b) recommended that at least an ANOVA p ≤ 0.05 filter, which is less stringent than an FDR p value filter, is applied to the data prior to modeling in BMDExpress. Our approaches also applied different statistical tests (MAANOVA FDR p ≤ 0.05 vs. ANOVA p < 0.05) for analyzing and pre-filtering data. Significant genes were selected based on: (1) fold change and corrected p value cutoff (FC ≤1.5; FDR p value <0.05) using MAANOVA in Approaches 1, 2, 4, 5, 6, 7, and 8; and (2) p value ≤0.05 using ANOVA in Approaches 3, 9, 10, and 11. Overall, BMD(L) t derived from our MAANOVA-filtered approaches, which applied an FDR correction and foldchange cutoff, were better at predicting apical PODs (86 out 168 approaches met the three criteria; 51%) than those approaches analyzed using ANOVA (15 out 96 approaches met the three criteria; 16%; Table 6). Thus, our results support more stringent pre-filtering of data for deriving BMD t s.
Based on the three criteria we described above, the BMD(L) t from approaches that applied a more stringent MAANOVA FDR p value ≤0.05 and fold change ≥1.5 for gene expression changes were most effective in predicting NOAEL and LOAEL at the 5-day time point. Moreover, all of the MAANOVA-analyzed approaches, as well as Approach 3 (the 20 lowest pathway BMD t s), met the three criteria for predicting the most sensitive apical endpoint observed on day 5. However, no approach met the three criteria when analyzed against the lowest BMD a endpoint across all time points at 5 days.

Conclusions
We leveraged published Affymetrix DNA microarray data on well-designed time-series and dose-response experiments in rats to evaluate 11 approaches to deriving a BMD t from groups of genes. We assessed the relationship between BMD t s derived using these 11 approaches to PODs derived from apical data that might be used in a human health risk assessment. To evaluate the effectiveness of the approaches in predicting apical PODs, we used three criteria: (1) ratio of BMD t to apical endpoint POD <3; (2) correlation coefficient p value <0.05 for BMD t to apical POD; and (3) likelihood ratio test p value >0.05 for deviation from the 1:1 line for BMD t versus apical POD. We found a very high degree of concordance between all of the approaches for deriving BMD(L) t s and apical PODs. Generally, in our opinion, BMD(L) t values derived using the 11 approaches were remarkably aligned with different apical PODs that may be used in human health risk assessment. The vast majority of BMD t s across all approaches were within tenfold of the various BMD a s and were largely within threefold as well. In general, across the 5-, 14-, 28-, and 90-day time points, eight, eight, eleven, and three approaches met our three criteria, respectively, and thus qualify as effective estimates of apical PODs.
Consistency in the PODs derived from transcriptional endpoints with those derived from standard toxicity endpoints increases confidence in the use of transcriptional PODs in human health risk assessment. However, the relevance of these specific gene expression perturbations to adverse effects is unclear, since they are not based on an MOA-centric approach. The approaches described assume that significant perturbations in gene expression in general may lead to an adverse outcome. Early thought in this field presumed that gene expression changes would occur at lower doses than adverse apical effects and thus would be overly conservative. In contrast, we demonstrate that mean transcriptional PODs from the approaches reviewed herein are generally higher than apical PODs, suggesting that this is not the case. Indeed, the mean transcriptional BMDs differ from the corresponding apical endpoints by less than1.5-fold for matched time points (Table 5-panel C) or are within tenfold across all time points (Table 5- panel D). These observations suggest that for these chemicals transcriptional changes do not occur at lower doses than apical responses, alleviating concerns that transcriptomics approaches in risk assessment would be overly protective.
Although additional studies using chemicals targeting different types of adverse effects are required to validate our findings, our results suggest that transcriptional response can be used as an efficient alternative approach for POD selection in chemical risk assessment. Transcriptional PODs were furthest from apical PODs at the 90-day time, suggesting that some dampening of transcriptional response may be occurring at later time points, and supporting the use of earlier time points to identify doses that significantly impact transcriptional profiles along a trajectory toward disease. Our results indicate that transcriptionally derived POD estimation from a short-term study are consistently within tenfold of PODs derived from apical endpoints from longer term studies. We also support previous findings that a more conservative statistical filter yields transcriptional PODs that are more aligned with apical PODs. However, our results suggest that any of the proposed approaches should produce transcriptional PODs that are within tenfold of the non-cancer and cancer BMD a in target tissues. While Approaches 7, 1, and 4 appear to be the best predictors of apical PODs, decisions on which approach is used could be determined based on the formulated risk assessment question and/or any guidelines established by the regulatory agency undertaking the evaluation if they exist (e.g., Bourdon-Lacombe et al. 2015). Overall, our paper suggests that transcriptional analyses produce highly robust BMD t metrics that are strong predictors of apical PODs within an acceptable measure of uncertainty (generally less than tenfold).
The integration of toxicogenomics into human health risk assessment is a relatively new area, and no international guidelines on this have been established. Major challenges include selecting the most relevant genes or pathways to pathophysiological effects. Addressing this issue will assist the integration of genomic data into chemical risk assessment. Previous work from our and other laboratories have provided initial examples of how toxicogenomics can be used to inform human health risk assessment, including both hazard identification and dose-response analysis (e.g., Andersen et al. 2010;Bercu et al. 2010;Bhat et al. 2013;Bourdon-Lacombe et al. 2015;Chepelev et al. 2015aChepelev et al. , b, 2016Clewell et al. 2011;Cote et al. 2016;Jackson et al. 2014;Moffat et al. 2015;Thomas et al. 2011Thomas et al. , 2012. It has been noted that toxicogenomics-guided elucidation of MOA information is helpful, as it can guide the selection of MOA-relevant key events and hence identification of relevant toxicogenomics data that should be used in dose-response analyses (e.g., Moffat et al. 2015). However, here we also support the previous contention that toxicogenomics data can be used for dose-response analyses even without detailed MOA information, especially in the case of "non-selective" chemicals (i.e., those interacting with multiple biological targets, and, hence, having the potential to contribute to several MOAs) ). Indeed, identifying MOA from toxicogenomics data can be time-consuming, as compounds may operate through several mechanisms perturbing an array of biological effects (Bercu et al. 2010). Therefore, it has been suggested that the lowest dose at which transcriptional perturbations become measurable (e.g., the lowest pathway BMD t ) should be considered for risk assessment . In this study, we demonstrate that a variety of approaches are amenable to deriving a highly reproducible transcriptional POD, thus not requiring regulatory agencies to select a single approach.