“Unsettling circularity”: Clinical trial enrichment and the evidentiary politics of chronic pain

The US Food and Drug Administration’s approval of the controversial analgesic Zohydro has drawn attention to its endorsement of so-called ‘enrichment strategies’ for streamlining clinical trials. Among these is ‘enriched enrollment randomized withdrawal’ (EERW), a design intended to improve measures of drug efficacy by screening out patients who are non-responsive or suffer adverse effects. EERW has been promoted as a response to the problem of high trial failure rates for drugs that were previously thought to be effective for pain management. This article uses EERW as a window into the evidentiary politics of pain medicine in the twenty-first century, against the backdrop of concerns about prescription opioid abuse. We demonstrate that the reframing of negative trials as ‘failed’ trials poses a challenge to the evidence hierarchy of evidence-based medicine, and identify several rhetorical strategies used by proponents to normalize EERW by placing it in continuity with routine clinical trial practice and the promise of personalized medicine. Finally, we illustrate how EERW, in the current regulatory context, fails to contribute to the individualization of diagnosis or therapy, and reinforces the perceived gulf between trials for regulatory approval and clinical practice.


Introduction
Allegations have been raised that a new, scientifically questionable methodology for drug approval was created at these pay-to-play meetings. If true, we have an alarming explanation for the indefensible decision of the FDA to approve Zohydro despite the FDA's own Anesthetic and Analgesic Drug Products Advisory Committee voting 11-2 against approval. (Manchin, 2014a)
In March of 2014, United States Senator Joe Manchin (D-West Virginia) published an open letter to Department of Health and Human Services Secretary Kathleen Sebelius, requesting that she reverse the US Food and Drug Administration's (FDA) recent approval of Zohydro ER, a long-acting analgesic containing hydrocodone, an opioid similar to codeine and morphine. The FDA's approval of the drug for severe chronic pain 6 months earlier had provoked a number of physicians, politicians and citizen watchdog groups to call for Zohydro's withdrawal from the market. Several bills restricting or outlawing the drug were proposed at State and Federal levels, and a coalition of activists called for the resignation of FDA Commissioner Margaret Hamburg (Burton, 2014;Frei, 2014;Manchin, 2014a;Smoot, 2014;Valencia, 2014). In the context of increasing rates of prescription opioid-related mortality, with hydrocodone products among the most widely prescribed (FDA, 2013b;King et al, 2014), many commentators took issue with the FDA's approval of a powerful hydrocodone-based opioid that did not possess the abuse-deterrent properties of other drugs in its class. Several pointed out that the FDA had granted approval against the recommendations of its own external expert advisory committee, which voted 11-2 against the drug (FDA, 2013b). At their most diplomatic, critics accused the FDA of sending mixed messages about opioid safety; others likened Zohydro to "heroin in a pill", the approval of which demonstrated that the FDA was firmly in the pocket of the pharmaceutical industry (Geller, 2014;New York State Assembly, 2014).
The public controversy was exacerbated by reports alleging a "pay-to-play" arrangement whereby pharmaceutical and medical device manufacturers paid to send representatives to private meetings with FDA officials. The same month as Zohydro's approval, the Milwaukee Journal Sentinel and Washington Post published exposés of a University of Rochester-based organization called the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT). During the preceding 10 years, IMMPACT and its co-chairs received hundreds of thousands of dollars from industry sources to facilitate meetings between the FDA and various firms, including Zohydro's then manufacturer Elan, to discuss improving and streamlining clinical trials for analgesics (Fauber, 2013;Whoriskey, 2013). Many interpreted the Zohydro approval as confirmation of the negative influence of the IMMPACT meetings and, by extension, of the pharmaceutical industry on regulatory decision making (Kivela, 2013;Fiore, 2014;Manchin, 2014a, b). Several of these accounts referred to a new clinical trial design strategy that had emerged from the IMMPACT consultations and been employed in the Zohydro clinical trials. Critics contended that so-called 'enriched enrollment' designs allowed companies to magnify the apparent effectiveness of a given drug by weeding out non-responders and patients who cannot tolerate adverse events, thus making it easier to win FDA approval (Kivela, 2013;Fiore, 2014;Ryan, 2014).
In this article we use the controversy surrounding Zohydro as a window into the evidentiary politics of opioid analgesics for chronic pain at the turn of the twenty-first century. Rather than Zohydro itself, our main foci are the discourse and practices of clinical trial 'enrichment', specifically enriched enrollment randomized withdrawal (EERW), as they are elaborated and endorsed across scientific and regulatory publications. EERW has been depicted in the press as a "controversial" and "scientifically questionable" methodology promoted by pharmaceutical companies to fast-track dubious drugs to market. It is not our aim to either support or refute this analysis. We are interested instead in EERW's relationship to existing evidentiary and regulatory infrastructures surrounding chronic pain. We contend that, in mobilizing claims about the 'failure' of clinical trials for chronic pain therapy, proponents of EERW present subtle challenges to the conventional wisdom regarding clinical trial design and evidence-based medicine (EBM). We examine a set of rhetorical strategies that situate EERW as continuous with both routinely accepted research practice and the promise of pharmacogenomics and personalized medicine. Both pharmacogenomics and EERW rest upon the assumption that heterogeneity in patients' drug response is caused by the existence of distinct patient subtypes which, once identified and elaborated, will provide the grounds for individualization of diagnosis and therapy (Hedgecoe, 2004;Lakoff, 2008;Dworkin et al, 2010). Although proponents of EERW designs for analgesic trials mobilize this rhetoric of heterogeneity and subtyping, in practice such trials do not actually contribute to the elucidation of this heterogeneity, or to the personalization of treatment.
EERW, under the regulatory framework outlined by the FDA, explicitly prioritizes internal validity over external validity and the generalizability of trial results, which in the context of pain medicine has serious implications for evaluations of safety and effectiveness. As such, we contend that EERW in its present iteration, rather than contributing to the individualization of treatment and diagnosis, suggests a widening of the existing gulf between RCTs for regulatory approval and actual clinical practice.
In this article, we describe how EERW has been mobilized by various actors in the scientific and regulatory literature. EERW sits at the nexus of multiple interests and influences. These include pharmaceutical companies seeking to streamline drug approval, but also pain specialists concerned about the fit of conventional trial designs for poorly understood chronic pain conditions, regulators attempting to address the lag in development of novel therapies, clinical trials experts trying to strengthen internal validity, and actors across various fields enthusiastic about the promise of personalized medicine.

Background: The Road to Enrichment
EERW is one of a number of different enrichment strategies for clinical trials designed to help researchers identify the study population best able to demonstrate the effects of a drug (Temple, 2013). EERW employs a two-stage design. The first stage is an uncontrolled, pre-randomization open-label period during which both patients and experimenters are aware that the test drug is being administered. All subjects are administered the test drug at a flexible dose titrated to individual effectiveness, typically a pre-specified measure of improvement (Farrar et al, 2001;Hewitt et al, 2011;FDA, 2014c). Patients who do not respond adequately to the drug or experience severe adverse events are removed from the study group, and the remaining subjects move on to the second stage, during which they are randomly assigned to an active or control/placebo arm. The rationale for this practice is that, given the variable effectiveness of many pharmaceuticals, enrichment produces a clearer indication of how well a drug works among the population who respond to it.
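The two-stage procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the protocol of any actual trial: the `Patient` record, the function names, the seven-patient cohort and the even split between arms are all hypothetical, and the 30 per cent default threshold simply borrows a cutoff for clinically meaningful improvement that has been proposed in the pain literature (Farrar et al, 2001).

```python
import random
from dataclasses import dataclass


@dataclass
class Patient:
    pid: int
    baseline_pain: float        # 0-10 rating at enrollment (must be > 0)
    open_label_pain: float      # 0-10 rating after open-label titration
    severe_adverse_event: bool  # flagged during the open-label stage


def is_responder(patient, threshold=0.30):
    """Stage 1 screen: tolerated the drug and improved by at least `threshold`."""
    if patient.severe_adverse_event:
        return False
    improvement = (patient.baseline_pain - patient.open_label_pain) / patient.baseline_pain
    return improvement >= threshold


def eerw_allocate(patients, rng, threshold=0.30):
    """Stage 2: randomize responders only; everyone else leaves the study."""
    responders = [p for p in patients if is_responder(p, threshold)]
    excluded = [p for p in patients if not is_responder(p, threshold)]
    shuffled = responders[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"active": shuffled[:half], "placebo": shuffled[half:], "excluded": excluded}


# Hypothetical cohort, invented purely for illustration.
cohort = [
    Patient(1, 8.0, 4.0, False),   # 50% improvement
    Patient(2, 7.0, 6.5, False),   # about 7% improvement
    Patient(3, 6.0, 3.0, True),    # improved, but severe adverse event
    Patient(4, 9.0, 6.0, False),   # about 33% improvement
    Patient(5, 5.0, 5.0, False),   # no improvement
    Patient(6, 8.0, 5.0, False),   # 37.5% improvement
    Patient(7, 10.0, 6.0, False),  # 40% improvement
]

arms = eerw_allocate(cohort, random.Random(0))
for name, members in arms.items():
    print(name, sorted(p.pid for p in members))
```

In this sketch three of the seven enrolled patients never reach randomization, so the placebo-controlled comparison is made only among patients already known to tolerate and respond to the drug.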
EERW advocates argue that traditional RCT designs may obscure the true efficacy of a drug, because positive results among drug 'responders' can be diluted both by poor or marginal results from 'non-responders', and by the placebo response, which is often considerable in pain trials. By screening out 'non-responders' in the first stage, EERW is supposed to identify drugs that work well in a subset of the population, even if their effectiveness across the entire population is negligible. EERW thus predicates the demonstration of efficacy upon the construction of patients as 'responders' and 'non-responders'.
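The dilution argument is, at bottom, a weighted average, and a toy calculation makes the advocates' reasoning concrete. The function name and all numbers below are hypothetical and chosen only for arithmetic convenience; the sketch deliberately ignores placebo response and measurement noise, and it assumes that patients fall cleanly into two groups, which is itself the contested premise.

```python
def average_effect(responder_fraction, effect_in_responders, effect_in_non_responders=0.0):
    """Mean treatment effect across a population that mixes two subgroups."""
    return (responder_fraction * effect_in_responders
            + (1.0 - responder_fraction) * effect_in_non_responders)


# Hypothetical scenario: one patient in four is a 'responder' whose pain falls
# by 2 points (on a 0-10 scale) more than it would on placebo; the rest gain nothing.
unselected_population = average_effect(0.25, 2.0)
enriched_population = average_effect(1.0, 2.0)

print(unselected_population)  # mean effect when everyone is randomized
print(enriched_population)    # mean effect when only responders are randomized
```

Under these assumptions the same drug shows an average benefit of half a point in an unselected trial and two points in an enriched one, although nothing about the drug itself has changed.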
Non-response may occur because of multiple factors. Pain is an inherently subjective phenomenon, which presents difficulties in measuring drug efficacy in a standardized manner and defining improvement outcomes. Various metrics have been employed for defining and characterizing analgesic response. Some attempt to capture qualitative, affective and temporal dimensions of pain (that is, current, recalled or durational pain), or measures of physical function (Turk et al, 2003;Dworkin et al, 2005b;Katz, 2005;Turk et al, 2008b;Moore et al, 2013). Measures of pain intensity such as Numerical Rating Scales and Visual Analog Scales, however, remain the most widely employed (Arthritis Research UK, 2013;Smith et al, 2015). In addition, heterogeneity of underlying pain mechanisms, genetic variations in drug metabolism, and aspects of trial design may all contribute to 'non-response' (Dworkin et al, 2011b).
By limiting the placebo-controlled phase to a smaller subset of responders, enriched trials are promoted as having improved assay sensitivity (the capacity of a trial to distinguish the efficacy of two different interventions, or between active treatment and placebo) while requiring fewer participants, and thus are easier and less costly to conduct. The run-in or maintenance period used to identify drug responders typically involves titrating patients to a stable and tolerable dose that provides a mean improvement of 30 per cent or greater. About 30 per cent improvement has been proposed as the cutoff for a "clinically meaningful" response (Farrar et al, 2001;Younger et al, 2009), although this may vary from trial to trial (Farrar et al, 2001;Hewitt et al, 2011;Arthritis Research UK, 2013). So-called 'non-responders' may include subjects who experience a lack of efficacy, fail to achieve a stabilized dose, or have too many or too serious adverse events, but patients may also be removed from the trial for protocol violation, withdrawal of consent, or other considerations (Hale et al, 2007;Katz et al, 2007;Hewitt et al, 2011). Although EERW studies provide less information about non-responders, proponents suggest that this shortcoming is outweighed by the benefits of improved assay sensitivity. This streamlined, "optimized" approach purportedly facilitates the testing of drugs that might otherwise require impractically large trials to demonstrate efficacy (Temple, 2010). A related approach, randomized discontinuation design, has also been employed in psychopharmaceuticals research, where similar issues of heterogeneity and high placebo response complicate evaluations of drug efficacy (Lakoff, 2007;Leon, 2011).
Contrary to the news reports cited above, enriched enrollment has a history that predates IMMPACT. While the terms 'enriched enrollment' and EERW were coined relatively recently, the design of an uncontrolled run-in period followed by randomization of responders is typically attributed to a paper written in 1975 by a pair of Janssen Pharmaceuticals employees. Amery and Dony's (1975) "A Clinical Trial Design Avoiding Undue Placebo Treatment" proposes a strategy to reduce methodological issues posed by placebo response, while addressing the ethical problem of exposing patients with chronic conditions to therapeutically inactive treatments. By removing all patients who did not experience an improvement after a short run-in period with flexible dosing of the experimental drug, researchers could reduce the number and duration of patients' exposure to placebo. In the absence of standardized terminology for this approach, it is difficult to say with certainty to what extent the practice caught on, but it appears to have gained acceptance in some research areas, and was used in drug trials for arrhythmia, vasospastic angina, and in maintenance studies for antidepressants during the 1970s and 1980s (Roden et al, 1980;McQuay et al, 2008;FDA, 2012c).
In 1994 Robert Temple, Deputy Center Director for Clinical Science of the FDA's Center for Drug Evaluation and Research (CDER) and an authority on trial design (Carpenter, 2010), published a paper on "special study designs", advocating this open-label-followed-by-randomization design primarily as a means for grappling with the potential heterogeneity of "poorly understood" conditions:
Where diseases are poorly understood it is entirely possible that they also contain distinct, but unrecognized, subsets of patients who respond differently to treatment. Where the non-responders are the minority, they will not interfere greatly with the ability of a study to show a treatment effect. But if the non-responders are the majority, they may interfere considerably, perhaps making a showing of effectiveness a practical impossibility. In this situation, it may be possible to identify a study population of potential responders not on any a priori basis but by using a clinical screening approach to select the participants in a subsequent randomized blinded trial. (Temple, 1994, p. 516)
Temple's paper is one of the first published uses of the term enrichment to describe such approaches to trial design (see also Davis et al, 1992), and identifies the vasospastic angina drug nifedipine as the first to be approved by the FDA on the basis of EERW clinical trials. The paper was not widely cited or discussed in the literature at the time, but in recent years the topic has seen renewed attention (Arthritis Research UK, 2013). In 2012, the FDA released its Draft Guidance for Industry on Enrichment Strategies for Clinical Trials to Support Approval of Human Drugs and Biologicals (hereafter GFI), which encouraged EERW and biomarker-driven targeting of patient subpopulations (FDA, 2012c).
The GFI, although a non-binding draft document, is the most visible, high-profile and coherent articulation of enriched enrollment thinking, and represents an official endorsement by the FDA of enriched trials for regulatory approval (as opposed to preliminary or proof-of-concept trials). This position has been reiterated in subsequent publications (Temple, 2012, 2013).

IMMPACT and ACTTION: Streamlining and standardizing the pain drug pipeline
While the GFI supports the use of EERW design for trials of drugs and biologicals across an array of conditions, EERW has received the most attention in the field of analgesics: a PubMed search conducted 22 February 2015 for the terms "enriched enrollment," "enriched enrolment," or "EERW" returned 33 records, 31 of which were for pain therapy trials. A 2013 review conducted by a UK-based arthritis research organization identified 57 published trials that had used EERW designs for analgesics in chronic non-cancer pain, 51 of which had been published since 1999 (Arthritis Research UK, 2013). Nevertheless, a recent paper on alternative trial designs for chronic pain found that EERW is "not generally accepted, nor much understood". The authors also noted that there have been "few good examples" of EERW and that "neither classic nor EERW trial designs have shown benefit for opioids in chronic non-cancer pain" (Moore et al, 2013, p. 44).
In 2010 the FDA announced the formation of a public-private partnership dedicated to "streamlin[ing] the discovery and development process for new analgesic drug products" (FDA, 2010). The Analgesic, Anesthetic, and Addiction Clinical Trials Translations Innovations Opportunities and Networks Initiative, or ACTTION (previously the Analgesic Clinical Trial Innovations Opportunities and Networks, or ACTION) brought together multiple stakeholders, including pain experts and professional associations such as the International Association for the Study of Pain, American Pain Society and the American Academy of Pain Medicine, pharmaceutical and biologicals companies, academics, patient advocates and Federal agencies, to develop knowledge infrastructures and coordinate collaborative research. The project is led by Robert Dworkin and Dennis Turk, who are also the founders and co-chairs of IMMPACT. In 2011, IMMPACT was absorbed under the administrative umbrella of ACTTION, although the two organizations continue to publish independently.
The FDA webpage and accompanying press release for ACTTION lament that recent advances in the neurobiology and physiopathology of pain have not been accompanied by improvement in available drug therapies. In the face of recent evidence suggesting that it is increasingly difficult to demonstrate efficacy in RCTs for painkillers, particularly opioids, ACTTION proposes the rethinking of clinical trial methodology:
Many experts in analgesic drug development believe that it is the design of the clinical trials that is at fault in this situation, and that better trial designs will yield more successful results. This hypothesis is certainly supported by the frequent failures of clinical efficacy trials of opioid drug products, considering the well-established effectiveness of these products from literally thousands of years of clinical experience. (FDA, 2010)
In suggesting that the problem for analgesic drug development lies not with the drugs themselves, but with the methodologies employed to test them, ACTTION mobilizes an argument made in corners of the pain research world during the preceding decade: that RCTs may not be well-suited to accommodate the complexity of the chronic pain experience. Multiple conditions, ranging from lower back pain to osteoarthritis to post-herpetic neuralgia and fibromyalgia are included under the rubric of chronic non-cancer pain, and entail diverse approaches to etiology and treatment. Clinical trials on analgesics for these conditions have long struggled with high placebo response rates and only modest improvements for patients, with a roughly 30 per cent reduction in pain for one-third of patients accepted as the industry standard for a successful treatment (FDA, 2012a). Considerable diversity of opinion exists over how best to capture the chronic pain experience and define improvement.
Some researchers have argued that the use of average pain scores in RCTs is deceptive, noting that the distribution of effect for pain tends not to follow a normal distribution. Rather than the usual bell-shaped curve, results from pain trials often produce a U-shaped curve, with higher numbers of patients experiencing either significant improvement or little to no improvement, and fewer falling close to the average (McQuay et al, 2008;Moore et al, 2010a, b;Moore, 2013). Although pain specialists often reported great success with opioids, these clinical results were not always easily reproduced in RCTs, and pain management has proven difficult to adapt to the particular demands of EBM. Making matters worse, researchers were frequently unable to replicate positive results from previous trials. Given this failure to replicate previous findings where "there [was] every reason to believe that the drug is truly efficacious" (Katz, 2005, p. s32), for some in the field it seemed to follow that there was something in the design and execution of RCTs that was interfering with the production of the expected results (Dworkin et al, 2005a;Katz, 2005;Katz et al, 2008).
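The point about averages can be illustrated with a small, entirely invented data set. The ten values below are hypothetical percentage reductions in pain intensity, constructed to be U-shaped; they are not drawn from any trial, and the 10-point window used to define "close to the average" is an arbitrary choice made for the illustration.

```python
from statistics import mean

# Hypothetical per-patient pain reduction (per cent), clustered at the two extremes.
pain_reduction = [0, 5, 10, 5, 0, 70, 80, 75, 10, 65]

average = mean(pain_reduction)

# How many patients actually experienced something close to the average?
near_average = [r for r in pain_reduction if abs(r - average) <= 10]

# How many improved substantially, and how many barely at all?
large_improvement = [r for r in pain_reduction if r >= 50]
little_improvement = [r for r in pain_reduction if r <= 10]

print(average, len(near_average), len(large_improvement), len(little_improvement))
```

In this constructed example the mean reduction is 32 per cent, yet no individual patient falls within ten points of it: four improve substantially and six hardly improve at all. The average describes a patient who does not exist in the sample, which is the sense in which critics call average pain scores deceptive.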
The 'failure' of clinical trials was not limited to pain RCTs. Since the early 2000s, the FDA had been concerned about an apparent slackening of the drug pipeline across the board, and thus supported various plans for 'modernization' and 'optimization' of both clinical trial designs and regulatory protocol (Woodcock and Woosley, 2008;Cambrosio et al, 2009). In 2004, the FDA launched the Critical Path Initiative to identify "development gaps" that were contributing to what was seen as a productivity crisis in the pharmaceutical industry (Woodcock and Woosley, 2008). One issue was that RCTs were seen as too cumbersome, time-consuming and expensive, and this was interfering with the translation of new treatments to market. As part of its goal of improving translational pathways, the Critical Path Initiative encouraged innovative approaches to trial design, including the development of exploratory Phase 0 trials for biomarker validation (which would be central to the emerging research program of personalized medicine), as well as the formation of public-private partnerships such as ACTTION that would bring members of industry, academia and administrative agencies into closer contact (Dworkin and Turk, 2011).
In this spirit, and at roughly the same time, Robert Dworkin and Dennis Turk, two pain specialists at the University of Rochester and the University of Washington respectively, formed IMMPACT. Both Turk and Dworkin were well-established in the pain world, as founding members and past presidents of the American Pain Society, and authors of hundreds of publications on pain science and treatment. IMMPACT was conceptualized as a series of meetings involving industry, academia and the FDA to "develop consensus reviews and recommendations for improving the design, execution and interpretation of clinical trials for pain" (IMMPACT, 2002). IMMPACT has published recommendations on patient-reported outcomes (Turk et al, 2006), assay sensitivity (Dworkin et al, 2012), abuse-deterrence (Dworkin et al, 2012), multiple endpoints and missing data analysis (Turk et al, 2008a;Gewandter et al, 2014b), disclosure of authorship contributions (Hunsinger et al, 2014) and interpretive "spin" (Gewandter et al, 2015). Together, these articles argue for the necessity of widely agreed-upon standard forms of recording and reporting data from clinical trials, as well as for alternative design strategies such as enrichment.

Enrichment had been a focus of IMMPACT since its 2006 meeting on "Randomized Clinical Trials for Chronic Pain Treatments: Placebo Controlled-Designs and their Alternatives" (IMMPACT, 2006), and in 2010 the organization published its recommendations on the value of enriched enrollment specifically for pain trials (Dworkin et al, 2010). While IMMPACT is a private organization and not a public-private partnership, FDA officials such as Bob Rappaport, then director of the CDER's Division of Anesthesia, Analgesia and Addiction Products, sometimes appear as co-authors on IMMPACT publications. FDA representatives attend the meetings in a consulting capacity alone, and the recommendations are not intended to represent official FDA policy.
The IMMPACT and ACTTION initiatives frame their mission explicitly as working toward the development of evidence-based clinical trial practice for pain therapeutics (Dworkin et al, 2011b;Fillingim et al, 2014). It is thus important to recognize that the current promotion of EERW is framed not solely in terms of cost containment and the streamlining of regulatory approval, but as part of a larger movement toward standardization and coordination of research.

Naturalizing Enrichment: The Authority of the Old and The Promise of the New
The GFI is the most coherent articulation of a broader discourse on the value and promise of enrichment (Katz, 2005;Katz et al, 2008;McQuay et al, 2008;Katz, 2009;Dworkin et al, 2010;Riordan and Murphy, 2010;Temple, 2010;Dworkin et al, 2012;Temple, 2012, 2013). The GFI defines enrichment as "the prospective use of any patient characteristic to select a study population in which detection of a drug effect (if one is in fact present) is more likely than it would be in an unselected population" (FDA, 2012c, p. 2), and divides enrichment strategies into three broad categories: practical, prognostic and predictive.
Practical enrichment encompasses a variety of practices already used in clinical trials, such as selecting a study population from those individuals who suffer from the specific condition for which the therapy is intended, rather than a de novo random sampling of the population. This population is further 'enriched' by selecting for patients who are not likely to spontaneously improve, who will comply with the treatment regimen or who fall within a baseline range of relevant measures (FDA, 2012c). Also referred to as "strategies to decrease heterogeneity" of a study population, thereby reducing noise, practical enrichment aims to better detect a drug's efficacy among those most likely to receive or potentially benefit from the treatment. Prognostic enrichment refers to the selection of patients at a higher risk of a particular clinical endpoint that the treatment is intended to address, such as heart attack, stroke or tumor occurrence (Temple, 2010;FDA, 2012c). Both prognostic and practical enrichment are presented as a normal part of accepted clinical trial practice.
Enriched enrollment is more commonly understood as a form of what Temple and the GFI describe as predictive enrichment: the selection and enrollment of patients more likely to demonstrate a treatment effect, if one exists (FDA, 2012c). A review of the literature indicates that "enrichment" is a term predominantly reserved for studies that employ a two-stage, open-label followed by randomization of responders design (McQuay et al, 2008;Katz, 2009;Quessy, 2010). The GFI divides predictive enrichment into biomarker-based and empiric strategies. A biomarker is defined as "a characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathological processes or pharmacological responses to a therapeutic intervention" (FDA, 2014b). Biomarker-based enrichment strategies involve the identification of a subsample of the patient population that shares some underlying attribute: a proteomic or genetic marker, a particular metabolic response to the test drug, or a phenotypic or genomic marker that predicts response even if its exact pathophysiological significance is unclear. Biomarker identification has become a major research endeavor, upon which depend the hopes of a new wave of 'precision medicine,' and 'personalized' or 'individualized' treatments, most visibly in oncology (Davis and Mitchell, 2012;FDA, 2013a;United States Department of Health and Human Services, 2015). Empiric enrichment strategies, on the other hand, do not rely on a previously identified pathophysiological target or biomarker, but use previous studies, screening, or lead-in periods to identify populations of responders. EERW is considered an empiric strategy, as researchers do not have a clear idea of who the responders will be until after the initial phase of open-label treatment.
The GFI employs two main rhetorical strategies to normalize and promote the use of EERW and of enrichment. First, by recasting routine aspects of trial design (such as excluding poor compliers and patients who cannot provide stable baseline measurements of important variables) as rudimentary forms of enrichment, the GFI establishes EERW as consistent and continuous with established practice. EERW is thus presented not as a challenge to the authority of the RCT (as might be presupposed by the 'failed trials' claim), but rather as one form of a widely used and accepted practice in clinical research. According to this account, all trials are enriched trials, to at least some extent, because they always employ some form of selection from the general population (Temple, 1994;Katz, 2005). EERW is "not really a novel idea" (Temple, 2013); it simply makes explicit what is already an implicit, ethically and methodologically uncontroversial component of all clinical trials.
Second, by subsuming biomarker-driven and molecularly targeted therapies under the enrichment umbrella, proponents are able to link EERW to the much-vaunted "promise of personalized medicine", evoking a future where clinicians are able to prospectively identify high-risk or high-response patients and tailor therapies accordingly (FDA, 2013a). The GFI devotes considerably more space to proteomic and genomic strategies than to empiric strategies such as EERW, and this rhetoric of personalization features prominently in the wider enrichment literature, even as proponents acknowledge that genetic, phenotypic and pharmacogenomic approaches to pain medicine remain largely speculative (Hedgecoe, 2004;Dworkin et al, 2011b;Dworkin et al, 2014). The juxtaposition of EERW and biomarker-based approaches under the common heading of predictive enrichment produces the effect of a shared justification and a shared potential, in spite of very divergent methodologies. Meanwhile, the concatenation of EERW with more mundane practices allows proponents of enrichment to mobilize both the familiarity of the old and the excitement of the new.

Efficacy Versus Effectiveness?
In theory, EERW studies could serve as a preliminary step toward the identification and molecular characterization of drug responders, by employing an investigational follow-up to determine whether an underlying shared trait could be identified to account for similarities in response. This is not, however, included as part of the EERW approach as promoted. In practice, biomarkers remain more an exciting frontier for chronic pain research than an actionable basis for drug development. Molecular targets for pain have yet to be validated, and they play no role in clinical practice. To date, drug manufacturers do not appear to be attempting to bridge this gulf between enriched trials and personalized therapy, nor has the FDA made characterization of drug responders a condition of, or a priority for, the acceptance of EERW trials for regulatory approval. The GFI states:
In contrast [to biomarker-driven studies], some empiric strategies that provide predictive enrichment (e.g., studying known responders in a conventional study or in a randomized withdrawal study) can efficiently establish the effectiveness of a drug in a subset of the population, but provide no way for prescribers to prospectively identify patients with a greater likelihood of response, or predict the magnitude of response in an unselected patient. Although this type of untargeted treatment may seem troubling (treatment of many to attain a response in only some), the reality is that this is generally the case with treatments that are approved on the basis of conventional studies in a nonenriched population, where there is typically a wide range of responses, including no effect at all, or even harm in some cases. . . . In general, then, FDA is prepared to approve drugs studied primarily or even solely in enriched populations and will seek to ensure truthful labeling that does not overstate either the likelihood of a response or the predictiveness of the enrichment factor. (FDA, 2012c, p. 31, italics added)
The FDA reasons that if a drug has been proven efficacious for some trial participants, the drug should be made available, even if there is no way in a real-world clinical setting to identify in advance which patients will respond. Consequently, drugs approved on the basis of studies in an enriched population do not include narrower prescribing information in the product labeling. Most drugs, the GFI points out, are approved for a general indication even though there is considerable variability in response; at least EERW designs approach the issue of heterogeneity of response directly, and build it into the analysis. The acknowledgment that this kind of approach "may seem troubling" touches on a tension, common in the design and interpretation of RCTs, between internal and external validity, or the extent to which the results of a well-designed randomized clinical trial may be extrapolated to clinical practice for the general population.
Internal validity is often expressed in terms of assay sensitivity: how well a trial is able to demonstrate that a change in one variable (the disease condition) is dependent on a change in the other (the experimental drug). RCTs generally strive to maximize internal validity, sometimes at the expense of external validity (Cartwright, 2007). Design decisions concentrate on identifying the efficacy of a drug in a highly controlled and artificial experimental context, using a carefully selected group of patients, rather than on the generalizability of these results to the wider clinical population (sometimes framed as efficacy versus effectiveness, although the distinction is not universally employed (Gartlehner et al, 2006;Rothwell, 2006)).

Although the RCT is widely accepted as the evidentiary gold standard, this apparent privileging of internal validity over general applicability of results remains a point of contention (Epstein, 2008;Marks, 2009).
Advocates of EERW have addressed this issue in different ways. Some researchers have focused on the value of EERW designs for proof-of-concept trials, and suggested that EERW actually more closely resembles clinical practice because in an enriched study one does not continue exposing a patient to a non-tolerated or ineffective therapy, just as a physician does not typically continue prescribing a drug that does not work (Hale et al, 2007;McQuay et al, 2008;Katz, 2009). Indeed, Amery and Dony's 1975 paper suggests that their strategy is "precisely similar" to clinical practice (Amery and Dony, 1975).
For the most part, however, the discourse on EERW that has emerged from the FDA, IMMPACT and related publications focuses on improved assay sensitivity (Katz, 2005;Dworkin et al, 2010;Riordan and Murphy, 2010;Woolf, 2010;Hewitt et al, 2011;Dworkin et al, 2012;FDA, 2012c;Gewandter et al, 2014a). Proponents acknowledge that the design sacrifices external validity for improved internal validity, but tend to treat the loss of external validity as a non-issue, because (in theory at least) the drug tested in an enriched trial is only intended for use in that small subset of the population for which it is effective (McQuay et al, 2008;Katz, 2009). 3 Moreover, proponents argue that EERW and conventional trials confront similar questions regarding generalizability because of the heterogeneity of the patient population and the use of average pain scores to measure efficacy in non-enriched trials, but in conventional trials the actual efficacy for the responder subgroup is muddied by the surrounding non-responders (Katz, 2005;FDA, 2012c). Unlike EERW, conventional clinical trial designs thus run the risk of missing vital information about efficacy, and abandoning drugs that could benefit a subgroup of patients.
Although some criticism of enriched enrollment has emerged from the pain field, it has been fairly piecemeal and no identifiable opposition to the newly FDA-endorsed movement appears to exist. Interestingly, the little endogenous critical literature on enriched enrollment takes issue not with its limited external validity, but with the claims to improved internal validity. Several authors have noted that the use of a run-in phase prior to randomization, which familiarizes all participants with a drug's effects, presents a risk for unblinding (Leber and Davis, 1998;Staud and Price, 2008a, b;Furlan et al, 2011;Hall and Kaptchuk, 2013). In addition, a recent meta-analysis comparing enriched with non-enriched trials for opioids in chronic non-cancer pain found that contrary to its rationale, EERW studies did not appear to produce stronger measures of efficacy, but tended to underestimate adverse events, which, as we will explore below, has a particular salience for studies of chronic pain treatments (Furlan et al, 2011;Quessy, 2011).

Destabilization and the Rhetoric of RCT 'Failure'
If the above may be viewed as an attempt to normalize enrichment by positioning it as both a continuation of routine RCT conduct and a waypoint on the road to personalized medicine, a third rhetorical strategy mobilizing the claim of RCT 'failure' may be seen as one of destabilization, producing an epistemological crisis for which EERW offers a solution. Amery and Dony's 1975 paper framed enrichment as a response to an ethical dilemma: whether it was morally acceptable to expose individuals to therapeutically inactive treatments. Since then, use of placebos in clinical trials has diminished, as ethicists and others have placed a greater emphasis on clinical equipoise. Currently, EERW is framed not as a response to an ethical dilemma, but rather to methodological problems regarding assay sensitivity and a concern about false negative results in RCTs for pain medications (Dworkin et al, 2011a;Dworkin et al, 2011b).
Recent literature on enrichment observes that many recent trials of analgesics for chronic pain have failed to demonstrate efficacy, sometimes contradicting previous evidence suggesting the drugs' effectiveness (Dworkin et al, 2010;Riordan and Murphy, 2010;Dworkin et al, 2011b). Conventionally, such negative study results could be interpreted as evidence that the treatments in question are not as effective as had previously been believed. In contrast, enrichment advocates suggest that the problem may lie not with earlier evidence, or with the drugs, but rather with the design of the trials themselves. In this context, 'failure' denotes the inability of a trial to confirm a prior conviction that a given drug works:
However, opioids are widely considered to be effective for a variety of different types of chronic pain, based on dozens of published RCTs, many observational studies, and millennia of clinical experience. It is therefore instructive to assume efficacy and review the trials to see whether certain methodological features are more likely to be associated with a positive trial than others. (Katz, 2005, p. s38, italics added)
Assuming the validity of previous evidence of efficacy and the comparability of the patients and outcome measures in these studies, such results may be a consequence of limitations in the ability of these RCTs to demonstrate the benefits of efficacious analgesic treatments versus placebo ("assay sensitivity"). (Dworkin et al, 2012, p. 1149)
In contrast to criticism of the pharmaceutical industry that focuses on the problem of false positives (that is, trials that erroneously suggest that a drug is effective), this strategy emphasizes the danger of false negatives (that is, trials that erroneously suggest that a drug is ineffective) (Dworkin and Turk, 2011;Dworkin et al, 2013).
By assuming efficacy for test drugs, negative trials are recast as failed or broken trials, and the implied risks of 'false negative' results (that potentially useful drugs will be abandoned) are mobilized to justify the reassessment of clinical trial design for pain medications.
It is beyond the scope of this article to determine whether older RCTs, newer RCTs, or clinical experience are most likely to produce accurate assessments of the actual effectiveness of these drugs. The randomized clinical trial has come to be considered the gold standard for information about drug safety and effectiveness, and is a key knowledge-producing technology in the hierarchy of EBM. This pride of place was hard-won. Since the mid-twentieth century, advocates campaigned for the privileging of RCTs over observational studies and clinical experience by, in Harry Marks' words, "taking an unfamiliar and sometimes disturbing set of practices - random allocation, professions of ignorance, etc. - and presenting them as if they were a natural extension of traditional research" (Marks, 2009, p. 88). In this way, the RCT, like EERW, represented a critique of the evidentiary status quo, posed in terms of continuity rather than rupture. However, it is worth noting that the rhetoric of RCT failure employed by EERW's proponents inverts the hierarchy of evidence at the heart of EBM. Whereas EBM privileges systematic reviews, meta-analyses and recent, well-designed RCTs as 'gold standards' of evidence, the rhetoric of failure appeals instead to the very forms of evidence that EBM is supposed to supplant: older studies, uncontrolled observational research, expert opinion and "millennia of clinical experience" (Katz, 2005;FDA, 2010).
This definition of failure reconfigures the conventional logic of clinical trials in a manner similar to personalized medicine: rather than relying upon the epistemological priority of standardized trial methodology to evaluate a drug's success or failure for a general population, the trial methodology is tweaked to focus on specific populations that are already assumed to benefit from a drug. In personalized medicine, however, this depends upon the availability of biomarkers, molecularly targeted agents, or pre-specified patient populations, none of which yet exist for chronic pain. The approach has been described as finding "the right patients for the drug" (Lakoff, 2007). In this reformulation, it is the drug (not the fact-making machinery of the trial, nor the patient, nor the "poorly understood" disease) that serves as the stable reference point for the investigation. The drug becomes the test of the trial, rather than the trial serving as the test of the drug.

Heterogeneity, Labeling, and the 'Empty Promise' of Personalization
As the current standard-bearer for molecularly-targeted therapy, oncology is the field to which the literature on analgesic trial enrichment typically refers in the attempt to associate EERW with personalized medicine (Davis and Mitchell, 2012). Medical oncology has emerged as an area of considerable experimentation with clinical trial design and execution, often under the banner of 'enrichment' (Kohli-Laven et al, 2011;American Society of Clinical Oncology, 2015;Catenacci, 2015). Indeed, an approach similar to EERW, randomized discontinuation design, has been employed in cancer research for many years, but only as part of the biomarker validation process (mostly in phase II trials for cytostatic agents) (Kopec et al, 1993;Freidlin and Simon, 2005;Galsky et al, 2010). However, this proposed continuity obscures a fundamental dissimilarity between pain medicine and oncology, perhaps between chronic pain and cancer as medical entities. Pharmacogenomics, molecularly targeted agents and companion diagnostics may be said to produce personalized medicine as an advanced form of kind-making (Hacking, 1999). 4 Cancers that were previously approached as relatively discrete, uniform entities have been disaggregated by new understandings of cancer genomics and the development of targeted drugs that work only for patients with particular genetic signatures, which become the basis of new configurations and sub-classifications of disease (Keating and Cambrosio, 2007;United States National Research Council, 2011). This type of investigational research presumes that such a deeper categorization exists, organizing patterns of response and non-response to actual and as-yet-undiscovered therapeutic agents (Lakoff, 2008), a presumption that has been more or less borne out in oncology research.
Temple's justification for enrichment reflects a similar logic: "Where diseases are poorly understood it is entirely possible that they also contain distinct, but unrecognized, subsets of patients who respond differently to treatment" (Temple 1994, p. 516). It is thus assumed that the heterogeneity of response to pain drugs (on which lowered measures of efficacy are blamed) is explained by the existence of some as-yet-unidentified kinds, whether constituted pharmacogenomically (on the basis of response to drugs) or pathophysiologically (on the basis of underlying pain mechanism). Enrichment strategies are thus posited as a means to penetrate the opaque heterogeneity of chronic pain and allow for the elaboration of these kinds, with generalizability sacrificed in the name of improved precision. However, it is neither obvious nor inevitable that chronic pain is in this way analogous to cancer. The search for mechanistic explanations of heterogeneity in pain conditions and analgesic response has been central to the pain research endeavor since well before the genomic turn (Baron, 2006;Woodcock et al, 2007;Borsook et al, 2011;Borsook and Kalso, 2013). However, short of molecular subtyping, even the more pragmatic goal of using clinically observable differences as bases for phenotypic subtyping has yet to be successfully realized (Attall et al, 2011;Baron et al, 2012;Dworkin et al, 2014). As Dworkin et al have acknowledged:
Unfortunately, to our knowledge, there are no replicated examples of the successful prediction of patient responses to analgesic treatments. If patients do not differ systematically in their "true" responses to a given treatment, there would be limited ability of either patient genotype or phenotype to predict that response. (Dworkin et al, 2014, p. 49)
Meanwhile, the empiric enrichment of EERW does nothing to elucidate this heterogeneity, nor even to substantiate the supposition that this kind of patterned typology exists beneath the noise: that is, that those identified as 'responders' within the context of a clinical trial actually do constitute a distinct subpopulation, a 'kind' of pain sufferer for whom there is a molecularly appropriate treatment. Indeed, even as it provides the justification for pursuing alternative trial designs such as enrichment, the distinct subtype explanation for heterogeneity of response remains largely an untested hypothesis (Senn, 2004;Dworkin et al, 2014). Ultimately, if this distinct subpopulation does not exist, or cannot be identified, outside of the confines of a clinical trial, then the division of patients into 'responders' and 'nonresponders' may be premature, even artifactual.
The limitations of EERW's approach to unraveling the heterogeneity of chronic pain are reflected in drug labeling. The GFI suggests that "enrichment strategies should be clearly described to indicate how the drug is to be used and to whom the results might apply (groups of patients that do and do not benefit)" (FDA, 2012c, p. 31), but later states that the agency is willing to approve drugs on the basis of studies that provide no way to prospectively identify the appropriate target population. The FDA's stated commitment to ensure "truthful labeling" notwithstanding, there has yet to be a major revision of the approach to analgesics labeling. Although some pain drugs are approved for specific conditions such as osteoarthritis or painful diabetic neuropathy, typically the indication is defined in terms of severity and duration (for example, for moderate-to-severe chronic pain where around-the-clock treatment is required for an extended period of time). 5 Of the five drugs recently approved for chronic pain on the basis of EERW studies (Zohydro, Ultram ER, Opana ER, Embeda and Lyrica), all but one bear a general indication which does not reflect the selective enrichment of the patient population (Angst, 2013). Pfizer's Lyrica is specifically indicated for neuropathic pain associated with diabetic peripheral neuropathy, post-herpetic neuralgia, partial onset seizures and fibromyalgia (FDA, 2014d), but these indications reflect traditional diagnostic classifications, not the supposed heterogeneity of response that may exist within them, or any underlying mechanism.
While such considerations represent a critical uncertainty at the heart of all attempts at nosological refinement, the stakes for analgesics development are, as we shall examine in the following section, particularly high, which may account for the public controversy surrounding IMMPACT, EERW and the Zohydro approval. If there is no underlying order by which the heterogeneity of response may be organized, then the privileging of internal validity over external validity becomes clinically meaningless, and given the risks of addiction, diversion and misuse associated with opioids, potentially quite dangerous. Proponents of EERW have brushed off the suggestion of an "unsettling circularity" to the design, arguing that drugs studied in a selected population are only intended to be prescribed for such a subset of the population (Katz, 2005;Dworkin et al, 2010). However, the inability of these trial results to prospectively identify responsive patients, and of the FDA to ensure selective labeling, in addition to the fact that the FDA has been approving drugs based on enriched trials for decades, gives the lie to these assurances. Although there are no data available on the total number of analgesics the FDA has approved on the basis of EERW trials, if the five drugs referenced above are representative, there is little in the analgesic drug labeling to alert patients and prescribers that the drugs were tested in enriched study populations.
The increased precision entailed by personalized medicine seems to run counter to the conventions of drug marketing. Historically drug manufacturers have sought the widest possible labeling, with multiple and inclusive indications, in order to increase prescriptions and expand their markets (Hedgecoe, 2004;Greene, 2007;IoM Committee on Advancing Pain Research Care and Education, 2011), which has in turn led to criticism of 'diagnosis creep' or 'disease mongering' (Moynihan et al, 2002). Some commentators have suggested that measures might need to be taken to incentivize manufacturers to take on the development of what would effectively be orphan drugs, useful to only a small percentage of patients (Woodcock et al, 2007;Woolf, 2010;IoM Committee on Advancing Pain Research Care and Education, 2011). The current position of the FDA on enriched enrollment appears at least for the time being to moot this problem by authorizing general indications for drugs that have only been tested in enriched populations. Although the GFI suggests that studies should be conducted in marker-negative populations to ensure that a predictive biomarker is well-validated and relevant data from non-responders is not overlooked, it fails to place a similar emphasis on those non-responders excluded from empirically enriched trials using EERW designs (FDA, 2012c). As such, even as EERW is being promoted as a means of elucidating the opaque heterogeneity of chronic pain, it nevertheless reproduces an uncertainty about the appropriateness of certain drugs for (un)certain disease conditions that it was supposed to resolve (McGoey, 2009).
Issues of heterogeneity of response, labeling and indication and the significance of negative trial results must be situated historically amidst the fraught evidentiary politics of pain medicine in the twenty-first century.
Critiques of EBM and RCTs as highly artificial and constrained modes of knowledge production are multiple and familiar (Wahlberg and McGoey, 2007;Epstein, 2008;Will and Moreira, 2010). Indeed, it is worth acknowledging that many of the issues presented by chronic pain research illustrate deeper tensions surrounding the authority and interpretation of clinical trials that have persisted since the mid-twentieth century (Marks, 1997, 2009;DeHue, 2010). However, that the challenge presented by the EERW rhetoric is produced under the auspices of an overtly evidence-based initiative is noteworthy. Pain is a notoriously subjective experience, and for scientists and clinicians seeking the objective measures preferred by EBM, chronic pain has proven particularly intractable, because of its inherent subjectivity and the relative inaccessibility of the brain and nervous system (Borsook and Kalso, 2013).
Medical interest in chronic pain has increased since the late 1980s, stimulated by a number of factors, including patient activism, professional organizing and legislative action (see the designation of pain as the 'fifth vital sign' and the passage of the Pain Relief Promotion Act (US Government, 2000)), as well as the development and prescription of opioid analgesics such as OxyContin for chronic non-cancer pain. Despite this growing interest, many aspects of chronic pain remain unclear, including basic questions about whether it is better approached as a symptom of some underlying pathology, or as a disease in itself (IoM Committee on Advancing Pain Research Care and Education, 2011). Researchers remain dependent on subjective patient self-reporting for pain measurement and even standardized methods for the clinical evaluation of pain rely on patient-reported numerical variables, facial expression scales or qualitative descriptions that range from "no pain at all" to "worst pain imaginable" (Hewitt et al, 2011). A significant portion of IMMPACT's and ACTTION's publications have been devoted to rationalizing and standardizing measurement, rating scales and outcomes in pain trials (Turk et al, 2003;Dworkin et al, 2005b;Turk et al, 2006;Farrar et al, 2014;Fillingim et al, 2014;Smith et al, 2015). Even as the burgeoning sciences of biomarkers and genomics have invigorated the search for a mechanism-based understanding of chronic pain (Kim et al, 2009;Diatchenko et al, 2013), pain scientists recognize that the fruits of such inquiry will still need to be coordinated with, rather than superordinate to, environmental and psychosocial aspects of the pain experience (Borsook and Kalso, 2013;Fillingim et al, 2014). 
While the fundamentally subjective nature of pain presents challenges for clinicians and researchers alike, it also opens up a space for critique of conventional clinical trial designs that require reliable, uncontroversial metrics of drug efficacy (Tousignant, 2011).

Pain also presents special challenges because the kinds of drugs that are used to treat it, particularly prescription opioids, present risks of serious adverse events such as non-medical use, abuse, addiction and overdose among both patients and the general population. Until the early 1990s, opioids were considered too dangerous for chronic pain management, and were prescribed almost exclusively for cancer pain and palliative care. In the past two decades, prescriptions for opioids for chronic non-cancer pain have more than quadrupled, and mortality rates for unintentional prescription drug overdoses have risen steeply in the United States and Canada. Deaths involving opioid analgesics, including hydrocodone, oxycodone, hydromorphone and methadone, have surpassed deaths from heroin and cocaine combined, and the "prescription opioid epidemic" is now widely recognized as a major public health problem (US Congress, 2012). In 2010, the 11th consecutive year in which drug overdose deaths increased, 75 per cent of all pharmaceutical overdose deaths involved opioids, and prescription opioids were involved in 16 651 deaths in the United States (King et al, 2014). While the exact causes of these increases are complex, they are strongly associated with a general increase in prescription of opioid analgesics, particularly stronger opioids such as OxyContin and methadone, and among patients with chronic non-cancer pain. In addition, a substantial proportion of opioid analgesics are 'diverted', or used by people other than those receiving the original prescription, either recreationally or therapeutically (US Congress, 2005, 2012). The high level of mortality, coupled with the prevalence of diversion and nonmedical use, raises the public health stakes of granting wide approval for a drug that is known to be effective only for a small percentage of patients.
In 2007 top executives of Purdue, the makers of OxyContin, pled guilty to fraudulent marketing involving unsubstantiated claims about the safety and effectiveness of the drug, which had become widely prescribed for chronic pain in the years since its approval. More recently, it has become clear that prominent members of the pain management community promoted the use of opioids for chronic pain in the absence of any strong evidence for their safety and effectiveness (Chou et al, 2009;Catan and Perez, 2012;Meier, 2012). While regulators and law enforcement have since the early 2000s been keenly aware of the dangers of opioid diversion and misuse, in recent years the safety of these drugs even for the intended patient population has come under scrutiny. Several studies and review articles have raised questions about the appropriateness of these drugs for a variety of conditions, not least for the long-term treatment of chronic non-cancer pain (Noble et al, 2008;Chou et al, 2009;Kissin, 2013;McNicol et al, 2013;Sehgal et al, 2013;Chou et al, 2015). A recent NIH-conducted systematic review of the long-term use of opioids for chronic pain found insufficient to low-quality evidence in a number of domains, including effectiveness and comparative effectiveness, risk of adverse events, dosing strategies and instruments for the identification of higher-risk patients (Chou et al, 2015;National Institutes of Health, 2015). Some of the literature on enrichment reflects this re-evaluation, emphasizing the lack of novel analgesic therapies over the past 20 years and the need for less risky treatment options than opioids (FDA, 2010;Woolf, 2010;Dworkin et al, 2011b; IoM Committee on Advancing Pain Research Care and Education, 2011). 
While this may well reflect a more sensitive analysis of the complexity of pain as a research object and clinical reality, the most common example mobilized by advocates of EERW remains the failure of recent RCTs to demonstrate the effectiveness of opioids (Dworkin et al, 2005a;Katz, 2005;Katz et al, 2008;FDA, 2010). Against this backdrop, the 'assumption of efficacy' underpinning the re-evaluation of trial design may also be interpreted as a strategic reaction in the face of an increasingly problematized evidence base.
The case of Zohydro is illustrative. FDA approvals are guided by a process of internal scientific review, supplemented in some cases with review by an external advisory committee. The FDA opted to approve Zohydro in spite of the advisory committee voting 11-2 against approval, on the grounds that the drug's benefits outweighed the risks for the small number of patients for whom it would be appropriate, ideally those already on a hydrocodone-containing drug who required higher doses than were currently available (FDA, 2012b). Although the advisory committee did not explicitly identify EERW as a problem, issues of generalizability, specificity and lack of data on non-responders (the acknowledged limitations of enriched studies) were central to the concerns of dissenting voters. The advisory committee expressed concern that limiting the study sample to a small percentage of responders produced little information on how the drug would be received in the comparatively unselected population of "real-life clinical practice" (FDA, 2013b). The committee also commented on the lack of data presented by Zogenix (Zohydro's manufacturer and sponsor of the clinical trials presented to the FDA) that could contribute to the characterization of the responders in the study group, or facilitate the identification of responders in the general population. One committee member emphasized that the indication sought for approval (moderate-to-severe chronic pain) was a generalized indication that gave prescribers no additional information or grounds for identifying the patients for whom the drug would actually be effective, even though 41 per cent of those initially enrolled in the study were discontinued for poor tolerance or inadequate response, followed by an additional 18 per cent of those in the treatment arm of the study (Rauck et al, 2014;ClinicalTrials.gov, 2014).
Against the backdrop of increasing opioid-related mortality, Senator Manchin, whose home state, West Virginia, in 2012 had the highest drug poisoning death rate in the United States (Centers for Disease Control and Prevention, 2014), and others were quick to draw a connection between enriched enrollment and the approval of Zohydro, describing EERW as an "alarming explanation" for the FDA's decision. However, EERW is not a design that may be smuggled in covertly in order to satisfy a sponsor's desire to produce more compelling numbers, as the comments of the advisory committee demonstrate. The fact of EERW's acceptance and promotion by a range of actors (including pharmaceutical companies, regulatory scientists and academic researchers) distinguishes it from cases of surreptitious "engineering out" or "engineering up" of trial results (Petryna, 2007). The FDA GFI and the work of ACTTION / IMMPACT represent attempts to modify evidentiary practices well upstream of any particular clinical trial. It is beyond the purview of this article to speculate what caused the FDA to approve Zohydro against the recommendations of its advisory committee, but it could also be said that the FDA approved Zohydro in spite of EERW, in that many of the objections of the committee directly reflect the acknowledged limitations of EERW design. Temple (2012), the FDA's most vocal advocate of EERW, has argued that, "While enrichment won't save a drug that doesn't work, it will help find one that will." This statement, albeit intended as a pithy summation of a complex set of practices, is nevertheless an oversimplification in several important ways.
FDA approval decisions are about more than whether or not a drug 'works', and entail calculation of the risks and benefits of the drug for its intended population, the availability of other treatments, and increasingly, in the case of analgesics, an assessment of the risks associated with diversion, misuse and abuse: what, taken together, could be called "clinical effectiveness" (Moore et al, 2010a). In the context of enrichment, not only do these considerations appear to be underserved, but 'finding a drug that works' comes to entail a reconfiguration of the very standards by which such an assessment is made. We have demonstrated that EERW represents a critique of the evidentiary status quo, posed in terms of both continuity and rupture. This continuity is, however, partial at best, and partly paradoxical: EERW is framed as a natural extension of the practices already "virtually universal" in trial practice, while the rhetoric of clinical trial failure depicts an epistemological crisis, calling into question the capacity of conventional RCT design to provide trustworthy evidence about drugs for chronic pain (FDA, 2013c).

Conclusion
In his analysis of the rhetoric of personalized medicine and "promissory science" (or "pharmacogenomic expectations") Hedgecoe cautions against dismissing such as-yet-unsubstantiated claims as mere "hype" (Hedgecoe and Martin, 2003;Hedgecoe 2004). Instead, he highlights the ways in which these claims serve to organize resources, guide research, and shape the regulatory and economic environment of the future. The endorsement of EERW by the FDA seems to represent such a reshaping of the regulatory environment under the auspices of the promise of personalized medicine. Significant in this case is not that genetic markers for pain have not yet been discovered, but that EERW does not in itself provide means for their discovery, nor does the current guidance contribute to the prospects for individualization of treatment. It must be borne in mind that the existing guidance is only a draft, and that future publications may differ significantly. Pharmaceutical companies have already responded to the GFI with requests for clarification on when EERW is considered appropriate and potential impacts on labeling (Regulations.gov, 2013). In the current regulatory context, however, EERW allows the streamlining of the drug development and approval process, in the absence of any contribution to a refined understanding of chronic pain or its treatment, reproducing the lacuna that troubles the heterogeneity of patient response. Despite attempts to address or preempt the apparent 'unsettling circularity' of EERW, the circle remains unbroken. In this respect EERW helps find the 'right patient for the drug' within the context of the clinical trial, but fails to help clinicians find the right drug for the patient, and so represents only an empty promise of personalization.
of medical knowledge and pharmaceuticals regulation, 'microbiopolitics', and human-microbe relations and wine. He is the author of Food & Trembling (Invisible Publishing, 2011). Nicholas B. King is an associate professor in the Biomedical Ethics Unit and Department of Social Studies of Medicine, and associate member of Department of Epidemiology, Biostatistics, and Occupational Health at McGill University. His research interests include the framing and interpretation of health information, public health ethics and policy, biosecurity, health inequalities and the social determinants of health.