Meta-analyses and data from large randomized trials have clearly established that cytotoxic chemotherapy and endocrine therapy induce a statistically significant increase in survival for all breast cancer patients [1-5]. Nonetheless, the degree of benefit varies among different patient groups [6, 7]. Metastatic breast cancer remains a largely incurable disease; over 40,000 women die of breast cancer each year in the United States alone [8]. The need for a more robust, accurate, cost-effective, and rapid drug development mechanism is clearly evident.

The value of many new drugs (in the broadest context) with respect to increasing life expectancy remains somewhat controversial [9, 10]. Moreover, the emergence of substantially more effective and less toxic new breast cancer therapies has been slow. To some degree, this may reflect the complexity of biological signaling in cancer cells [11]. All existing therapies together hit fewer than 500 molecular targets [12], suggesting that there are many unexplored targets for drug discovery within the human interactome, which comprises possibly 1 million proteins and over 1 trillion potential interconnections. Nonetheless, there are clearly other limitations in drug development. Fewer than 10% of investigational new drugs for new molecules proceed beyond early development [13]; the approval rate for new oncology drugs is ~5% [14].

Perhaps the lack of significant progress partly reflects the drug development process, in which preclinical animal models play a central role. The leading causes of attrition of new drugs are generally cited as unpredictable toxicities and lack of efficacy, the early identification of which is a primary goal of preclinical animal models. Moreover, the most common toxicities are pharmacological in nature [15] and might be expected to be evident in adequate animal models when appropriately used. Preclinical animal models are used primarily to predict the safety and efficacy of investigational drugs prior to their use in humans, reflecting the adherence of most governments to the Nuremberg Code.

This code specifically requires (among various other considerations) that experiments in humans be designed based upon the results of animal experimentation, and that the risk to subjects should not exceed the humanitarian importance of the problem [16]. For a typical phase I clinical trial, the starting dose is usually based upon one-tenth of the maximum tolerated dose (or the severely toxic dose) in the most sensitive preclinical animal model tested. For phase 0 (microdosing) trials, the first-in-human dose is generally estimated as one-fiftieth of the no observable adverse effect level in rats [14].
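As a rough illustration of the two starting-dose rules described above, the following Python sketch applies the one-tenth and one-fiftieth fractions. The function names, species labels, and numeric values are hypothetical and are not taken from any trial. Doses are assumed to be expressed in mg/m² so that species can be compared directly. Actual dose selection involves regulatory and clinical judgment that is not modeled here.

```python
from typing import Dict, Tuple


def phase1_starting_dose(mtd_by_species: Dict[str, float]) -> Tuple[str, float]:
    """Return (most sensitive species, one-tenth of its maximum tolerated dose).

    The most sensitive species is taken to be the one with the lowest
    maximum tolerated dose (or severely toxic dose), in mg/m^2.
    """
    if not mtd_by_species:
        raise ValueError("at least one species is required")
    most_sensitive = min(mtd_by_species, key=mtd_by_species.get)
    return most_sensitive, mtd_by_species[most_sensitive] / 10.0


def phase0_starting_dose(rat_noael: float) -> float:
    """Return one-fiftieth of the no observable adverse effect level in rats."""
    if rat_noael <= 0:
        raise ValueError("NOAEL must be positive")
    return rat_noael / 50.0


if __name__ == "__main__":
    # Hypothetical values in mg/m^2, for illustration only.
    species, dose = phase1_starting_dose({"mouse": 300.0, "dog": 120.0})
    print(f"Phase I start: {dose} mg/m^2 (based on {species})")
    print(f"Phase 0 start: {phase0_starting_dose(50.0)} mg/m^2")
```

The sketch only encodes the arithmetic; it makes no attempt to decide which toxicity measure (maximum tolerated dose or severely toxic dose) is appropriate for a given study.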

There are certainly exceptions, but the general ability of a maximum tolerated dose estimate in animal models to provide a reasonable prediction of the maximum tolerated dose in humans is established. This seems to hold for cytotoxic drugs generally [17] and within specific classes of these drugs [18], particularly when dosing is adjusted across species using body surface area (mg drug/m²). For investigational new drugs, the US Food and Drug Administration requires toxicity/safety data from two species (one rodent and one nonrodent). While some human toxicities are overpredicted or underpredicted [19], this approach predicts the nature of toxicity in humans with ~70% concordance [20]. When toxicities are concordant, these arise in humans within 1 month in almost 95% of instances [20]. While preclinical animal models are imperfect, viable alternatives for selecting safe doses for first-in-human studies are not yet evident or likely to become widely accepted in the immediate future.
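The body surface area adjustment mentioned above can be sketched as a simple unit conversion. The example below multiplies a dose in mg/kg by a species-specific factor (often called Km) to obtain mg/m². The Km values shown are the commonly cited approximate factors from standard conversion tables; they are an assumption of this sketch and are not given in this article, and the function names are hypothetical.

```python
# Approximate, commonly cited mg/kg -> mg/m^2 conversion factors (Km).
# These values are an assumption of this sketch, not data from the article.
KM = {"mouse": 3.0, "rat": 6.0, "dog": 20.0, "human": 37.0}


def mg_per_kg_to_mg_per_m2(dose_mg_per_kg: float, species: str) -> float:
    """Convert a body-weight dose to a body-surface-area dose."""
    return dose_mg_per_kg * KM[species]


def equivalent_mg_per_kg(dose_mg_per_kg: float, from_species: str, to_species: str) -> float:
    """Dose in to_species (mg/kg) giving the same mg/m^2 exposure as in from_species."""
    return mg_per_kg_to_mg_per_m2(dose_mg_per_kg, from_species) / KM[to_species]


if __name__ == "__main__":
    # Hypothetical mouse dose of 10 mg/kg.
    print(mg_per_kg_to_mg_per_m2(10.0, "mouse"))         # dose in mg/m^2
    print(equivalent_mg_per_kg(10.0, "mouse", "human"))  # human dose in mg/kg
```

Expressing doses per unit surface area rather than per unit body weight is what allows the maximum tolerated dose in a small animal to be compared with that in humans on a common scale.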

Preclinical studies also provide the opportunity to measure various pharmacokinetic/pharmacodynamic properties of a drug (absorption, distribution, metabolism, elimination, toxicity). For molecularly targeted therapies, it is important to measure tissue concentrations and to determine whether the test drug successfully modulates the target (or a surrogate biomarker predictive of drug action). These data can also guide the incorporation of such measurements into first-in-human studies, a major goal of phase 0 trials for molecular proof of concept [14].

Since efficacy in one or more preclinical animal tumor models generally provides the rationale in support of a new anti-neoplastic drug's probable benefit in humans, is the primary modifiable cause of attrition of new drugs a consequence of poor prediction of human activity/potency from animal models? For many years, the primary National Cancer Institute (USA) in vivo screen for anticancer drug activity used the L1210 and P388 murine leukemia models, which appear to have no direct biological relationship to human breast cancer. Nevertheless, several of the cytotoxic drugs used in breast cancer were developed while this was a key component of the preclinical animal model efficacy screen. This apparent utility probably reflects its use primarily for the screening of drugs targeted at the generic properties of DNA replication and proliferation, rather than specific molecular processes driving breast cancer.

Animal models currently available for testing breast cancer drugs include xenografts of human breast cancer cell lines growing in immunodeficient mice [21, 22], chemically induced mouse models (for example, 7,12-dimethylbenzanthracene, N-nitrosomethylurea), virally induced mouse models (for example, mouse mammary tumor virus, polyomavirus), and genetically manipulated mouse models [23]. These models should be better suited for the development of molecularly targeted drugs that may have greater specificity/activity in breast cancer. Nonetheless, almost two decades of the use of these models has yet to improve fundamentally the rate at which new breast cancer drugs are successfully moved from the laboratory into clinical practice.

Is the high attrition rate of new drugs a function of the use of the wrong models, poor use of the right models, and/or lack of adequate models? Forcing the wrong question onto an inappropriate model greatly increases the likelihood that the data will be misinterpreted [22]. Selecting the most appropriate preclinical animal models for molecularly targeted therapies may appear less challenging, provided we have an adequate understanding of the target's biology. Such models, however, may also overestimate sensitivity because they are driven (potentially exclusively) by the molecular target under investigation, a target that may not exhibit this functional relationship with a comparable prevalence in human breast tumors. For example, endocrine therapy has proven very successful and yet only 50% of all estrogen receptor-positive breast cancers respond to endocrine therapies [24]. A lack of understanding of the diversity of signaling and its redundant and degenerate properties may lead to the development of molecularly targeted therapies that fail to live up to their full potential.

Breast cancer is a highly heterogeneous disease; heterogeneity is often evident even within the same tumor. Cell line xenografts and genetically manipulated mouse models are more homogeneous. No single model may therefore adequately reflect the heterogeneity, and drug responsiveness, of any breast cancer subgroup. The relative homogeneity of these models may render them overpredictive or underpredictive, depending on how prevalent their phenotype is in the human disease. A prediction of high sensitivity often leads to a drug being considered a strong candidate for human testing. If the animal models were too sensitive, overpredicted drugs would show limited activity/potency in human efficacy trials and so (after significant investment) experience a high attrition rate. While current breast cancer models may well overpredict sensitivity relative to the human disease, it is difficult to assess underprediction because drugs that lack activity in preclinical animal models are likely to be dropped early. More models (individual or panels of models) that are more representative of the heterogeneity of the human disease are probably needed. For example, relatively few breast cancer models show clinically relevant patterns of spontaneous metastasis from primary tumors.
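The overprediction argument above can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes a deliberately simplified picture in which a homogeneous model is driven entirely by one target, while only a fraction of human tumors depend on that target. All names and numbers are hypothetical and serve only to show how a low prevalence of the model's phenotype dilutes the response seen in an unselected human population.

```python
def expected_response_rate(prevalence: float,
                           response_if_driven: float,
                           response_otherwise: float = 0.0) -> float:
    """Expected response rate in an unselected human population.

    prevalence         -- fraction of human tumors driven by the target
    response_if_driven -- response rate among target-driven tumors
    response_otherwise -- response rate among all other tumors
    """
    for value in (prevalence, response_if_driven, response_otherwise):
        if not 0.0 <= value <= 1.0:
            raise ValueError("rates must lie between 0 and 1")
    return prevalence * response_if_driven + (1.0 - prevalence) * response_otherwise


def overprediction_factor(model_response: float,
                          prevalence: float,
                          response_if_driven: float,
                          response_otherwise: float = 0.0) -> float:
    """Ratio of the model's response rate to the expected human response rate."""
    expected = expected_response_rate(prevalence, response_if_driven, response_otherwise)
    if expected == 0.0:
        raise ValueError("expected human response rate is zero")
    return model_response / expected


if __name__ == "__main__":
    # Hypothetical: 90% response in the model, target drives 20% of human
    # tumors, and half of those target-driven tumors respond.
    print(expected_response_rate(0.2, 0.5))
    print(overprediction_factor(0.9, 0.2, 0.5))
```

With these illustrative inputs the model response is roughly nine times the expected response in unselected patients, which is the kind of gap that would surface only in human efficacy trials. The same arithmetic suggests why panels of models, or patient selection by target status, might narrow the gap.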

Once a drug is approved for a specific disease, it may be used for other indications. This off-label usage is increasing and is potentially problematic [25], particularly if applied without adequate prior experimentation. The failure (poor activity, unpredictable toxicity) of some new combinations may reflect a lack of rigorous preclinical investigation. Since there is no a priori requirement for animal efficacy studies, clinical trials may be designed primarily from in vitro data where adverse pharmacokinetic, pharmacodynamic, toxicologic, and/or molecular feedback signaling interactions may be inadequately modeled. Adverse interactions could be missed, such as increased toxicity and/or reduced therapeutic efficacy.

Conclusions and comments

The use of animal models for safety testing of investigational drugs may be imperfect but is likely to continue for the foreseeable future. While the use of preclinical animal models for efficacy assessment will probably also persist, it is less clear that current preclinical animal models are entirely adequate or appropriately used (even when adequate), or that an optimal panel of such models to predict efficacy with sufficient accuracy currently exists.

Certainly, the potential for existing models to overpredict or underpredict should be carefully considered during efficacy studies and in the decision to proceed to studies in humans. Greater care may also be needed in the design of preclinical studies. For example, is the chosen model the most appropriate, and/or should multiple models be used? Is orthotopic rather than subcutaneous inoculation required for xenografts? Might the immunodeficient phenotype of xenograft hosts affect drug action? Does the choice of species generate metabolite profiles that differ functionally from those in humans? What are the most appropriate and rigorous endpoints for assessing efficacy [21, 22]? For off-label use/combinations, data from adequate preclinical modeling are strongly encouraged and should, perhaps, be formally required to guide the design of clinical trials (by institutional review boards if not by governments). More accurate (predictive) preclinical animal model screening has the potential to reduce the cost, and increase the pace, of successful drug development for breast cancer.