Introduction

Health technology assessment (HTA) is a form of health care policy research that evaluates the impact of a health care technology (e.g., instrument, medication or procedure) on society (or a subset of society such as a payer or patient perspective). It incorporates a range of methods and approaches that are tailored to the needs of the health care system in various countries and considers various aspects of that impact such as efficacy, toxicity or negative outcomes, costs, and patient burden or quality of life [1]. Unlike many parts of the world that require HTA as part of the process for pricing, access, and reimbursement, it is not required in Japan or the United States (US) for regulatory approval or for reimbursement. Nevertheless, HTA is conducted and published by industry, academia, and other groups to compare the value of treatment options to inform decision making in these countries. In Japan, HTA is not yet a required evaluation process for approval, but there is evidence suggesting that this will be more important in the not-so-distant future [2, 3]. Japan has a national healthcare system in which patients pay a portion of the total costs, and the government or employer (via insurance) covers the remainder of the costs. The statutory health insurance system is administered by a multitude of insurers: the government (from October 2008, a quasi-governmental body, the Japan Health Insurance Association) for employees of small to medium-sized firms and their dependents, 1584 Society-managed Health Insurance funds for employees of large firms and their dependents, 76 Mutual Aid Society (MAS) funds for government employees and dependents, 1835 municipal National Health Insurance funds for the self-employed, retired and unemployed, and 166 National Health Insurance Society funds for some occupational groups such as doctors and lawyers, each with different premium contribution rates [4]. 
Japan relies heavily on hospital care; facilities support a large number of hospital beds, and patient care reflects a long average length of stay and high utilization rates compared with other regions of the world. National health expenditures in Japan have increased significantly since 1961 with the establishment of the National Health Insurance system for the entire population. The government has set a policy goal of controlling the growth of health expenditures in parallel with national economic growth to reduce the financial burden on the population. In 1990, at a time of economic boom, national health expenditure as a proportion of GDP was almost 5% [4]. Since then, economic growth has stagnated and, therefore, health care costs as a percentage of national income have increased steadily. Given the long-lasting economic recession, coupled with the rapidly aging population, the burden of health care costs is expected to grow in the foreseeable future [4].

In comparison to Japan, there are factors that reduce the urgency of the US government to take primary responsibility to curb health care spending. Only a small portion of total health care costs are currently covered through national or State government budgets (e.g., Medicare, Medicaid), and there is no national healthcare system in the US [1]. Therefore, the value of healthcare expenditures has a less direct impact on the national budget than in other countries; health care is largely provided by a variety of for-profit commercial enterprises in the US (i.e., there are numerous health insurers, centers, clinics, hospitals, and providers who must each make a profit on the provision of health care services to the public). While some not-for-profit healthcare organizations exist, the majority of health care services in the US are part of a competitive free-market system in which competition is used to control costs due to supply and demand; however, there are market forces in place for health care that cause it to perform differently from other commodities. There remains an escalation of per-capita health care costs as the demand for health care continually increases with the aging population and with the increase in chronic diseases. Insurance can, in part, help to control the demand for health care through referral and pre-authorization processes, but this can only be applied to about half the US population. In the US approximately 28% are uninsured and an additional 20% are underinsured, limiting the influence of insurers to control health care resource utilization [5]. However, these numbers are declining with the implementation of the Affordable Care Act, which provides access to health care insurance to those not covered by their employers. In part due to incomplete insurance coverage, health care costs are a significant cause of individual bankruptcy in the US [6, 7]. 
Despite the burden of health care costs in the US, there remains a cultural attitude that discourages the implementation of any type of cost per outcome access restriction, which makes legislation to restrict access to high-cost services difficult for elected officials to propose [8]. Not only does the US spend more than Japan in terms of per-capita health care expenditures [9], but the US also spends approximately three times as much on health care administration as Japan does [10].

In regions that require HTA research to be conducted for national approval or reimbursement decisions, there are standard guidelines and expectations for the way in which these studies are to be presented that differ country by country [e.g., The National Institute for Health and Care Excellence (NICE) in the UK, The Institute for Quality and Efficiency in Health Care (IQWiG) in Germany]. International standards for the conduct of cost analyses also guide this type of research for academic and good research practice purposes [11, 12], and guidelines exist for the transfer of analyses from one country to another [13]. However, where no regional requirements exist for HTA, such as in the US and Japan, there is no straightforward way to adjust models to apply to the different payer and regional needs, and researchers must make decisions about the inputs and unit costs that are most relevant to the target decision maker.

Two cost-minimization analyses were conducted in the US and in Japan. The US model was developed by RTI Health Solutions and was modified for Japan by PRIMA Consulting Group to inform evidence-based decision making regarding the outcomes of the PRONOUNCE trial (ClinicalTrials.gov identifier, NCT00948675) [14]. The PRONOUNCE trial was a phase III randomized trial comparing two first-line regimens followed by single-agent maintenance for advanced non-squamous non-small cell lung cancer (NSCLC) [15]. This phase III trial found no significant differences in progression-free survival (PFS) without grade 4 toxicity, PFS, overall survival (OS), objective response rate (ORR), or disease control rate (DCR) between pemetrexed plus carboplatin followed by single-agent pemetrexed (PemC) and paclitaxel plus carboplatin plus bevacizumab followed by single-agent bevacizumab (PCB). The two regimens were similar across all primary and secondary endpoints.

Based on the results of the PRONOUNCE trial, the US and Japan cost-minimization analyses were conducted to estimate the costs associated with PemC versus PCB in each region. The US model was developed from a third-party payer perspective and the Japanese model was developed from the perspective of the hospital payer. This study was designed to compare these models and to explore the differences between them to inform the considerations for transferability of economic evaluations between the US and Japan in the context of these two very different health care systems and payment structures.

Methods

Briefly, the cost-minimization models were both developed with a focus on direct medical costs as described below for Japan and the US, respectively. The medical costs considered included the costs of drugs, drug administration and infusion, toxicities, post-discontinuation therapy, and supportive care. The unit costs of each input were multiplied by the amount of the resource used and then summed to obtain an overall cost for each treatment in each region. The details of this approach are presented in the subsections that follow.
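The costing approach described above (unit cost multiplied by resource use, summed across categories) can be illustrated with a minimal sketch. The `CostItem` structure, the function names, and all numeric values below are hypothetical placeholders chosen for illustration; they are not inputs from either model.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    category: str      # e.g., "drug", "administration", "toxicity"
    unit_cost: float   # cost per unit of resource (hypothetical values below)
    quantity: float    # mean units of the resource used per patient


def total_cost(items):
    """Sum unit cost x quantity over all resource items for one regimen."""
    return sum(item.unit_cost * item.quantity for item in items)


def cost_by_category(items):
    """Aggregate unit cost x quantity within each cost category."""
    totals = {}
    for item in items:
        totals[item.category] = totals.get(item.category, 0.0) + item.unit_cost * item.quantity
    return totals


# Hypothetical inputs for a single regimen in a single region.
regimen_a = [
    CostItem("drug", 100.0, 10.0),
    CostItem("administration", 50.0, 10.0),
    CostItem("toxicity", 2000.0, 0.25),
    CostItem("post-discontinuation therapy", 500.0, 3.0),
    CostItem("supportive care", 300.0, 2.0),
]

print(total_cost(regimen_a))
print(cost_by_category(regimen_a))
```

Repeating the same calculation with region-specific unit costs, while holding the trial-based quantities fixed, gives the per-region totals that a cross-country comparison would then contrast.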

Population

The Japan and US cost models were developed based on the population of the PRONOUNCE trial, which is described in detail elsewhere [15]. Eligible patients were age 18 years and older with a diagnosis of advanced NSCLC (stage IIIB or IV) that was not amenable to curative therapy; patients were not allowed to have received any prior systemic chemotherapy, immunotherapy, targeted therapy, or biological therapy for any stage of NSCLC; and patients had to have an Eastern Cooperative Oncology Group performance status of 0 or 1 at the time of study enrollment. Patients were randomized to receive four cycles of induction PemC or PCB followed by single-agent maintenance. Patients were to receive treatment until the time of disease progression. As stated earlier, there were no significant differences between treatment arms for any of the primary or secondary study aims.

Treatment Comparators and Time Horizon

Both models included costs associated with the PRONOUNCE trial regimens of PemC or PCB. Patients began induction therapy with one of these two regimens; patients with complete response, partial response, or stable disease after four cycles continued onto single-agent maintenance therapy. The PemC group received pemetrexed 500 mg/m2 plus carboplatin area under the curve (AUC) 6 every 21 days for 4 cycles, followed by pemetrexed 500 mg/m2 every 21 days until progressive disease or treatment discontinuation. The PCB group received paclitaxel 200 mg/m2 plus carboplatin AUC 6 plus bevacizumab 15 mg/kg every 21 days for four cycles followed by bevacizumab 15 mg/kg every 21 days until progressive disease or treatment discontinuation [15].

The time horizon of both models was the length of the PRONOUNCE trial, which was approximately three years.

Drug Costs

To estimate the drug acquisition costs (induction and maintenance) for both models during the progression-free period, the number of administrations, as reported in the PRONOUNCE trial, was combined with average dosing and unit costs for each drug within each regimen. The mean number of administrations was 10.80 [standard error (SE) 0.66] for PemC and 10.5 (SE 0.54) for PCB. The mean number of induction and maintenance cycles was 3.45 and 6.80 for PemC, respectively, and 3.35 and 6.5 for PCB, respectively.
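As a rough illustration of how administrations, dosing, and unit costs combine, the sketch below computes a drug acquisition cost for a body-surface-area-dosed agent. The body surface area (1.8 m2) and the cost per mg (5.0) are assumed placeholder values, not model inputs, and applying the reported mean of 10.80 administrations to a single drug is done purely for illustration. Vial rounding and wastage are not modeled.

```python
def drug_acquisition_cost(n_administrations, dose_per_size_unit, patient_size, cost_per_mg):
    """Acquisition cost = administrations x mg per administration x cost per mg.

    dose_per_size_unit: dose in mg/m2 (or mg/kg)
    patient_size: body surface area in m2 (or weight in kg)
    """
    mg_per_administration = dose_per_size_unit * patient_size
    return n_administrations * mg_per_administration * cost_per_mg


# Reported values: 500 mg/m2 pemetrexed dose, mean of 10.80 administrations (PemC).
# Assumed placeholders: body surface area of 1.8 m2 and a cost of 5.0 per mg.
pemetrexed_cost = drug_acquisition_cost(
    n_administrations=10.80,
    dose_per_size_unit=500.0,
    patient_size=1.8,
    cost_per_mg=5.0,
)
print(round(pemetrexed_cost, 2))
```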

Costs in Japan

Direct medical care costs incurred within Japanese hospitals were considered for patients treated with the two study regimens. Unit costs included in the model are direct medical costs and are categorized as drug acquisition costs, administration costs, transfusion costs, and toxicity costs from the Hokenyaku Jiten (保険薬辞典), published by Jihou (じほう) and International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) costs (Table 1). Billing arrangements were based on the payment structure in Japan for chemotherapy treatment. A case-mix system, known as Diagnosis Procedure Combination (DPC), has been introduced, which has helped in implementing a standardized electronic claim system, with keywords that provide transparency and accountability. The DPC is comparable to the Diagnosis-Related Groups used in Medicare in the US and comprises 18 Major Diagnosis Categories, 520 diagnostic groups and 2658 case-mix groups. The diagnosis and comorbidities/complications are coded using the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) scheme, while the procedures are coded using the Japanese Procedure Codes as defined in the fee schedule of the national health insurance system [16]. For the Japan model, the expert clinical opinion of three oncologists provided cost estimates for the clinical management of toxicities.

Table 1 Inputs for the US and Japan cost models

Costs in the US

Direct medical costs from a US third-party perspective were estimated for the progression-free period as well as for the post-progression period. Unit costs included in the model are direct medical costs and are categorized as drug acquisition costs, administration costs, transfusion costs, and toxicity costs from published sources including the Centers for Medicare and Medicaid Services [17], the Red Book from Truven Health Analytics [18], and Healthcare Cost and Utilization Project databases [19]. In the US, the cost of toxicity was based on hospitalization costs associated with ICD-9-CM codes reflective of the toxicity.

Treatment Patterns for Both Models

Specific to these models are inputs related directly to the study treatments, post-discontinuation therapy, other medical costs, and unit costs. The clinical trial provided data related to study treatment and post-discontinuation therapy (Table 2). Data from the clinical trial included post-discontinuation therapy, utilization of granulocyte-colony stimulating factors (G-CSFs), rate of transfusions, and utilization of erythropoiesis-stimulating agents (ESAs) [15]. The mean erlotinib dose was assumed to be 150 mg per day as recommended in the package insert and the standard error was assumed to be 10% of the mean. The mean docetaxel dose was calculated from the recommended dose of 75 mg/m2 and the standard error was assumed to be 10% of the mean. For this calculation, the average body surface area of lung cancer patients was taken from the PRONOUNCE trial [15]. Due to the lack of available data on the amount of use of post-progression treatments from PRONOUNCE (number of cycles, combination regimens), all intravenous systemic therapy in subsequent lines was assumed to be given for three cycles and administration costs were applied for each component individually. Oral erlotinib was assumed to be dosed for three cycles (21 days each) and to incur zero administration costs. The systemic therapy dosing inputs for each regimen are also shown in Table 2. Additional treatment pattern assumptions included in both models are the following: patients are limited to one transfusion of each type; ESAs are used every other cycle until the end of induction therapy; epoetin alfa dosing differs by brand name: Procrit® (Janssen Products, LP) is dosed at 150 units/kg three times every other week and Epogen® (Amgen) is dosed at 100 units/kg three times every other week; darbepoetin alfa is dosed at 500 µg every 3 weeks; and G-CSFs are given every cycle until the end of induction therapy at a dose of 6 mg per cycle.

Table 2 Study treatment and post-discontinuation therapy based on the PRONOUNCE trial (US and Japan models) [15]

In both cost comparison models, one-way sensitivity analyses were conducted to examine the effect of individual parameter changes on model results. Additionally, probabilistic sensitivity analyses using Monte Carlo methods (10,000 iterations) were conducted to model the joint uncertainty around model parameters, while scenario analyses tested alternative model assumptions regarding costs. Finally, a threshold analysis was conducted to determine the increase in price for pemetrexed or the decrease in price for bevacizumab that would result in equivalent costs between the regimens. Scenario analyses were conducted around those cost parameters for which assumptions had to be made regarding utilization patterns for PemC and PCB, including: (1) no subsequent therapy costs were attributed to either PemC or PCB after disease progression; (2) no subsequent therapy, ESA, G-CSF, or transfusion costs were incurred by either arm; (3) no subsequent therapy, ESA, G-CSF, transfusion, or toxicity costs were incurred by either arm; and (4) only the costs of induction and maintenance drugs for PemC or PCB were modeled. The US and Japan models were designed and conducted in concordance with International Society for Pharmacoeconomics and Outcomes Research (ISPOR) good practice guidelines [12, 13].
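A probabilistic sensitivity analysis of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the models' actual implementation: the choice of a normal distribution for the number of administrations and a gamma distribution for costs is an assumption, the per-administration costs (5000 and 6000, with standard errors of 10% of the mean) are hypothetical, and each regimen is reduced to a single cost component. Only the mean number of administrations and their standard errors (10.80, SE 0.66; 10.5, SE 0.54) and the 10,000 iterations come from the text.

```python
import random
import statistics


def gamma_from_mean_se(rng, mean, se):
    """Draw from a gamma distribution parameterized by its mean and standard error."""
    shape = (mean / se) ** 2
    scale = se ** 2 / mean
    return rng.gammavariate(shape, scale)


def run_psa(n_iter=10_000, seed=0):
    """Return (mean cost difference A - B, probability that A is less costly)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_iter):
        # Number of administrations: normal draws (assumed), truncated at zero.
        n_a = max(rng.gauss(10.80, 0.66), 0.0)
        n_b = max(rng.gauss(10.5, 0.54), 0.0)
        # Cost per administration: hypothetical means with SE of 10% of the mean.
        cost_a = gamma_from_mean_se(rng, 5000.0, 500.0)
        cost_b = gamma_from_mean_se(rng, 6000.0, 600.0)
        diffs.append(n_a * cost_a - n_b * cost_b)
    prob_a_less_costly = sum(d < 0 for d in diffs) / n_iter
    return statistics.mean(diffs), prob_a_less_costly


mean_diff, prob = run_psa()
print(f"mean difference: {mean_diff:.0f}; P(A less costly): {prob:.3f}")
```

Fixing the seed makes the simulation reproducible; a full model would draw every uncertain input (toxicity rates, supportive care use, unit costs) jointly in each iteration.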

Cross-Country Comparison

The two models were compared across key input and transferability factors as identified in ISPOR guidance documents [12, 13]. These points included: the decision problem; regional treatment patterns; cost inputs; use of local data; sensitivity analysis parameters; base case results; and results of probabilistic and sensitivity analyses. As both models were completed in 2013, the Japanese Yen to US dollar comparisons were based on a 2013 exchange rate monthly average of 97.63 Yen per US $1 [20].
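The currency conversion is a simple division by the stated exchange rate. The short sketch below applies the 2013 average of 97.63 Yen per US $1 to two per-mg thresholds reported in the Results; the function name is illustrative only.

```python
JPY_PER_USD_2013 = 97.63  # 2013 monthly average exchange rate stated in the text


def jpy_to_usd(amount_jpy, rate=JPY_PER_USD_2013):
    """Convert an amount in Japanese Yen to US dollars at the given rate."""
    return amount_jpy / rate


# Per-mg threshold prices reported in the Results (Yen), converted to US dollars.
print(round(jpy_to_usd(612.77), 2))
print(round(jpy_to_usd(229.90), 2))
```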

Compliance with Ethical Guidelines

The analysis in this article is based on previously conducted studies and does not involve any new studies of human or animal subjects performed by any of the authors.

Results

Both the US and Japan models approached the decision problem of cost analysis in the setting of a randomized trial (PRONOUNCE) in which there were no statistically significant findings for the study efficacy outcomes. Treatment patterns were primarily based on the clinical trial interventions, with the exception of the care for toxicity, which was not mandated by the phase III study protocol. The estimated costs of grade 3–4 toxicity management were consistently higher in the US than in Japan, ranging from a difference of $4499 for vomiting to a difference in cost of $25,728 for thromboembolism. Cost inputs were consistently identified, but cost per mg varied between countries; bevacizumab and erlotinib cost more in the US than in Japan and all other chemotherapy agents cost more in Japan than in the US.

A comparison of the base case scenarios for the US and Japan models is presented in Table 3. Despite the differences in cost estimates of the key cost parameters (drug costs and toxicity costs), the direction of the models was consistent in the base case, with the exception of the costs of chemotherapy infusion, which were lower for PCB in the Japan model while higher for PCB in the US model. This is likely in part due to the additional infusion time costs that can be billed for drug administration in the US (Table 1). In both models, drug acquisition costs were identified in sensitivity analyses as the primary contributor to the model outcomes (Fig. 1). Other factors that influenced outcomes in both models included G-CSF use and duration of therapy. Other than input costs, the parameters used for sensitivity analyses were consistent between models.

Table 3 Base case results

Fig. 1 One-way sensitivity analyses for US (above) and Japan (below). G-CSF Granulocyte-colony stimulating factor, PCB Induction bevacizumab plus paclitaxel plus carboplatin followed by single-agent bevacizumab, PemC Induction pemetrexed plus carboplatin followed by single-agent pemetrexed

Probabilistic sensitivity analyses found directionally similar results despite the country-specific cost parameters used in the models. Figure 2 shows the mean, median, and distribution of total costs when the range of input parameters is varied. The cost ranges of PemC and PCB were more similar in the US, where the 25th and 75th percentiles overlap. In comparison, for Japan there is a separation of these ranges between PemC and PCB.

Fig. 2 Probabilistic sensitivity analyses for the US (a) and Japan (b) models, mean (diamond), median (bar), range (error bars) and 25–75% quartile (box height) total costs based on 10,000 iterations. PCB Induction bevacizumab plus paclitaxel plus carboplatin followed by single-agent bevacizumab, PemC Induction pemetrexed plus carboplatin followed by single-agent pemetrexed

In the US, increasing the cost per mg for pemetrexed by 43% (i.e., to $8.29 per mg) resulted in PemC and PCB being approximately equivalent (PemC total costs = $73,974; PCB total costs = $73,440). Further increases in unit costs to pemetrexed would lead to PCB being cost saving. Similarly, decreasing the cost per mg for bevacizumab by 30% (i.e., to $4.35 per mg) resulted in PemC and PCB being approximately equivalent (PemC total cost = $57,164; PCB total costs = $56,899). Further decreases in unit costs to bevacizumab would lead to PCB being cost saving. In Japan, greater price changes were needed to reach these thresholds. Increasing the cost per mg for pemetrexed by 70% (to ¥612.77 per mg/US$6.28 per mg) resulted in PemC and PCB being approximately equivalent (PemC total costs = ¥4,791,232/US$49,075; PCB total costs = ¥4,824,345/US$49,414). Further increases in unit costs to pemetrexed would lead to PCB being cost saving. Similarly, decreasing the cost per mg for bevacizumab by 53% (to ¥229.90 per mg/US$2.35 per mg) resulted in PemC and PCB being approximately equivalent (PemC total cost = ¥3,084,146/US$31,590; PCB total costs = ¥3,028,541/US$31,021). Further decreases in unit costs to bevacizumab would lead to PCB being cost saving.
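The logic of a threshold analysis can be sketched as a break-even calculation. The sketch assumes that each regimen's total cost is linear in the unit price of the drug being varied, and it allows the drug to contribute to both arms (the reported totals for both regimens shift when a price is varied). All numeric inputs are hypothetical and the function name is illustrative; this is not the models' actual procedure.

```python
def break_even_price(fixed_a, mg_a, fixed_b, mg_b):
    """Unit price at which two regimens have equal total cost.

    Each regimen's total cost is assumed to be: fixed + mg * price, where
    mg is the total amount of the varied drug attributed to that regimen.
    """
    if mg_a == mg_b:
        raise ValueError("Price does not affect the cost difference; no threshold exists.")
    return (fixed_b - fixed_a) / (mg_a - mg_b)


# Hypothetical inputs: regimen A uses 9000 mg of the drug, regimen B uses 1000 mg.
base_price = 5.0
threshold = break_even_price(fixed_a=10_000.0, mg_a=9_000.0, fixed_b=60_000.0, mg_b=1_000.0)
pct_change = threshold / base_price - 1.0

total_a = 10_000.0 + 9_000.0 * threshold
total_b = 60_000.0 + 1_000.0 * threshold
print(threshold, pct_change, total_a, total_b)
```

A positive percentage change means the price must rise to reach equivalence; beyond that threshold, the ranking of the two regimens reverses.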

Lastly, a comparison of the scenario analyses is presented in Table 4. In Japan there was a greater price differential, leading to a higher probability of cost savings with PemC than in the US model, although in both models PemC was more likely than PCB to be the less costly regimen across all proposed scenarios.

Table 4 Scenario analyses, US and Japan models

Discussion

In a comparison of health care spending across 13 industrialized nations, the US was found to have the highest per-capita spending on health care, while Japan had the lowest [9]. The current cost comparison study was designed to take into account these large systematic differences in national health care spending to understand the implications at the level of a drug to drug comparison. Based on this assessment, recommendations are made regarding the key considerations when adapting clinical trial-based models from one country to another. Unless cost structures vary dramatically between the agents under consideration from country to country, cost analyses may be expected to be directionally consistent given the underlying clinical trial parameters that remain unchanged. However, even small price differences between countries can have a relatively large impact on probabilistic and threshold analyses.

There are a number of limitations of the models used in this cross-country comparison, as well as for the comparison between these US and Japan-focused models. First, these cost analyses are not likely to be directly generalizable to a US payer or to the Japanese national health care system, as the models were based on the population and treatments used in the PRONOUNCE clinical trial. Specifically, the US model was designed from a third-party payer perspective and the Japan model from a hospital perspective. In the US model, hospitalizations were not included directly in the analysis as these were incorporated as part of the cost of toxicities. The impact of non-toxicity related hospitalizations is not accounted for in the US model. While costs were extracted from publicly available sources for both models, in the US pricing structures differ across organizations and the publicly available costs may not reflect those used by private insurance companies or for reimbursement. For the Japan model, the impact of private insurance costs was not included. However, sensitivity analyses were consistent across a wide variation in possible drug acquisition prices in the US and Japan. Additionally, the dose of docetaxel in this model was 75 mg/m2, which while common in the US, may be higher than the 60 mg/m2 dose that is more commonly used in Japan. As a result, the cost of docetaxel may be slightly higher than expected for this region. However, the input costs of docetaxel for the additional 15 mg/m2 are very small given the generic costs of this agent and the use of 75 versus 60 mg/m2 does not impact any of the results from this study.

There are other factors that may influence economic modeling across countries that could not be directly compared in this study. This study compared cost analyses across regions from the same clinical trial, so differences in outcomes were not a factor. While in a clinical trial outcomes would be consistent regardless of the region to which the model were adapted, the use of quality-adjusted life years could impact the outcomes differently by country. For example, a country’s preference for use of societal or patient-based utility weights to adjust outcomes for quality and country-specific algorithms for EQ-5D elicited preference weights could affect the outcome differently by region. Other factors that may differ between countries that could not be compared in this study include discount rates, the inclusion versus exclusion of indirect costs, and procedures that may be conducted in an inpatient versus outpatient setting in the different regions.

Despite these limitations, this comparison demonstrates how even moderate differences in pricing may affect outcomes. While the direction of these two models remained consistent between the US and Japan, there were differences in results, with the cost savings in Japan being more robust than in the US model. This was a cost analysis; however, the results of the cross-country comparison would be unchanged in a cost-effectiveness analysis of life-years gained based on clinical trial data, as the survival outcomes would not vary between models even if they differ between treatment arms. When adapting or applying a model from one country to another, the factors outlined in this report should be taken into consideration for how the implications of one model may or may not apply to another region.

This study was limited to the time frame of the PRONOUNCE study. Future research is needed to generate more generalizable cost comparison data between these regimens using real-world data. Such a study could demonstrate how the costs and outcomes of these regimens may compare in an uncontrolled setting. The results of this study will be more applicable to settings in which clinical practice patterns follow the use of these regimens as prescribed for the PRONOUNCE trial and in which the regimens are administered within the populations that were eligible for enrollment in the study.