The reply of Waalkes et al. (2014a) to our concerns (Cohen et al. 2014) regarding their publication in Archives of Toxicology (Waalkes et al. 2014b) did not address the key issues. Their study reported an unusual dose–response relationship, together with considerable mortality and tumor incidence results that contrast with those of their previous study (Tokar et al. 2011); these findings could not be explained. Waalkes et al. (2014b) state in the manuscript that “… these results should be interpreted with great care. For one, the reason for the absence of a typical dose–response for lung tumor formation is unknown and requires thoughtful scrutiny and confirmation and further study.” The rational explanation for the “unusual dose–response,” which is readily apparent when comparing their two papers, lies with the historical controls: there is no treatment-related effect.

Ignoring historical controls in carcinogenicity studies is inappropriate. Unquestionably, the concurrent control is very important, but as stated by Elmore and Peddada (2009), “To assess if the tumor responses in the current study are unusual in comparison to what is known historically about the lesion among control animals, it is customary for researchers to compare the responses in the current study with the tumor incidences in control groups from previous studies. Historical control data (HCD) is the term used for this compilation of data from previous studies. Thus, the HCD can be used to determine if the tumor incidence in the concurrent control group or dual control groups are consistent with the tumor incidence in historical control groups. Comparison to the tumor incidence rates in treated groups with both concurrent control groups and HCD can, along with other study data such as the incidence of other lesions of similar cell lineage, help to determine biological relevance.” Historical control data are routinely used by regulatory agencies in the assessment of chronic toxicity and carcinogenicity data. For example, according to the FDA (2001): “Historical control data can also be used as a quality control mechanism for a carcinogenicity experiment by assessing the reasonableness of the spontaneous tumor rates in the concurrent control group (Haseman 1984; Haseman et al. 1984) and for evaluation of disparate findings in dual concurrent controls.” The use of historical control data, particularly from the same laboratory, is a well-established method used by regulatory agencies and the National Toxicology Program (NTP) for understanding how an animal model performs and for preventing the overinterpretation of results (Keenan et al. 2009).

Waalkes et al. (2014a) are correct that the historical control data from Charles River Laboratories are not the most appropriate tumor data for comparison with the results of this study, because they come from a different laboratory. However, Waalkes and his colleagues have never provided historical control data from their own laboratory for comparison. In our letter to the editor, we therefore used the results from their 2011 paper (Tokar et al. 2011), which we assume was conducted at the same laboratory and under similar conditions, as the historical tumor incidence comparison.

Lung tumors in this strain of mouse are very common and quite variable (Nikitin et al. 2004; Giknis and Clifford 2005). A comparison of the lung tumor incidence ranges in the current and previously reported Waalkes et al. studies with those in the literature shows that the ranges overlap considerably. Thus, the logical conclusion from these data is that their most recent study shows no treatment-related effect at any dose. An additional confounder in interpreting the results of this study is the high and variable mortality seen in various groups, the effects of which on tumor incidence cannot be known. Such mortality is nevertheless a major complicating factor in interpretation, which is why regulatory agencies and the NTP have guidance on survival that must be met for a study to be considered acceptable. Deaths due to the tumors themselves are the exception, but there is no indication that the early mortality in Waalkes et al. (2014b) was tumor-related.

Finally, the suggestion by Waalkes and colleagues that the inconsistent dose–response indicates a non-monotonic dose–response is not supported by previous studies with arsenic. It is also inconsistent with current knowledge of arsenic effects in animal studies and, more importantly, with the findings of epidemiology studies, which provide no evidence of a non-monotonic response for cancer or non-cancer endpoints. The authors' suggestion is therefore not supported by the available data.