Background

Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive, and fatal lung disease for which there is no known cause or proven effective therapy [1, 2]. Pirfenidone (5-methyl-1-phenyl-2-[1H]-pyridone; Shionogi & Co., Ltd., Osaka, Japan; MARNAC Inc., Dallas, TX, USA) [36] is a pyridone compound with therapeutic potential for IPF that has been shown in animal models to have wide-ranging effects including antifibrotic, anti-inflammatory and antioxidant activity, although its precise mode of action is unknown [2, 711]. A multi-centere, double-blind, placebo-controlled, randomized phase III clinical trial was conducted in Japanese patients with IPF to determine the efficacy and safety of pirfenidone over 52 weeks [12]. Significant differences were observed in the decline of vital capacity (VC; primary endpoint) between placebo group and high-dose (1800 mg/day) group; and in the secondary end point, the progression free survival (PFS) time, between the two groups. Treatment with pirfenidone was associated with a decreased rate of decline in VC and increased the PFS time over 52 weeks.

A 10% change in forced VC (FVC) have been reported to be a promising prognostic indicator, because patients with ≥ 10% decline in FVC within 6 or 12 months have a poor prognosis [1315]. In the treatment guidelines published by the American Thoracic Society (ATS)/European Respiratory Society (ERS) as well, a ≥ 10% change in FVC and ≥ 15% change in diffusing capacity of the lung for carbon monoxide (DLCO) are described as indices of improvement or worsening of disease [16]. To evaluate changes over a period from 6 months to 1 year, however, the method using a 10% change in FVC as an index is not sensitive enough and may not be suitable for actual clinical setting. Recently, Zappala et al. have reported that marginal decline in FVC is associated with a poor outcome in IPF [17]. In this report, the authors demonstrated that IPF patients had a significantly poor prognosis when the decline in FVC after 6 months was either 5% to 10% or ≥ 10%. This information is considered useful for selecting patients with progressive disease and evaluating therapeutic effects in clinical studies.

Based on this report, we reviewed the efficacy of pirfenidone in the phase III trial in an exploratory manner using a 5% change in VC as indices, evaluated the coincidence of the ratings based on 5% change in VC between months 3 and 12, and examined the usefulness and significance of the 5% change.

Methods

Overall Study Design

This study was a multicentre, double-blind, randomized, placebo-controlled trial. The diagnosis of IPF was in accordance with the ATS/ERS Consensus statement [16] and 4th version of the guideline of clinical diagnostic criteria for idiopathic interstitial pneumonia in Japan [18]. Eligible patients were adults (20 to 75 years old) with IPF diagnosis based on above criteria and meeting the following SpO2 criteria: 1) demonstrate oxygen desaturation of > 5% difference between resting SpO2 and the lowest SpO2 during a 6-minute steady-state exercise test (6MET), and 2) the lowest SpO2 during the 6MET > 85% while breathing air. Using the data in our pirfenidone phase III trial [12], we performed a series of exploratory analyses of physiologic variables and characteristics in patients receiving high-dose pirfenidone [1800 mg/day], low-dose pirfenidone [1200 mg/day] or placebo.

Setting, Participants, and Randomization

In this phase III study, 325 patients were screened at 73 centers in Japan, and 275 patients were randomized to one of the three groups: the high-dose, low-dose and placebo groups. Of the 275 patients, 267 (108, 55 and 104 patients in the high-dose, low-dose and placebo groups, respectively) were deemed eligible for the full analysis set (FAS). Eight patients were excluded due to having no post-baseline data.

Measurements

The primary endpoint was the change in VC from baseline to Week 52. Secondary endpoints were PFS time and the change in the lowest SpO2 during 6MET. VC was measured every 4 weeks, while the lowest SpO2 during the 6MET and other PFTs were determined every 12 weeks.

Statistical Analysis

In order to examine the characteristics of the improvement ratings and PFS based on 5% change in VC in the comparison of efficacy among treatment groups, and the clinical significance of the 5% decline in VC at month 3, we performed following analyses. Significance level of tests was set at 0.1 (two-sided) according to the phase III study [12].

• Categorical analysis based on 5% change in VC

Improvement ratings were defined based on 5% relative changes in absolute VC from baseline as "improved (≥ 5% increase)", "stable (< 5% change)", and "worsened (VC ≥ 5% decrease)", using VC values measured at 12, 28, 40, and 52 weeks after the start of treatment, and these ratings were used as those at months 3, 6, 9, and 12, respectively. Then, the distributions of the improvement ratings were compared between, high-dose pirfenidone (n = 108) and placebo (n = 104) groups, and (high- and low-dose) pirfenidone (n = 163) and placebo (n = 104) groups, with Wilcoxon rank sum test. The risk ratio was also calculated as the ratio of proportion of "worsened" in pirfenidone group to the proportion in placebo group at each time point. The principle of the last observation carried forward (LOCF) was adopted to impute missing values if patient data were available for ≥ 4 weeks after the baseline. The number of patients prematurely dropped and for whom missing observations were imputed was shown in online supplemental materials of the preceding reports in details [12, 19].

• Comparison of PFS times based on 5% decline in VC or death

PFS times by definition of disease progression as death or ≥ 5% relative decline in absolute VC were obtained. (In our previous paper, we used ≥ 10% instead of ≥ 5% decline in VC to define PFS times [12].) Then, the cumulative PFS rates were estimated with Kaplan-Meier (K-M) method and the distributions of PFS times were compared with log-rank test between high-dose pirfenidone and placebo groups, and low-dose pirfenidone and placebo groups. In addition, the disease progression was defined also by ≥ 5% decline in VC on two consecutive data points or death and similar analyses of PFS times thus defined were performed.

• Coincidence of the improvement ratings based on 5% change in VC at months 3 and 12, in terms of positive and negative predictive values

In order to examine the coincidence of the improvement ratings at month 3 and 12, that were derived as shown in the subsection "Categorical analysis based on 5% change in VC", we calculated positive and negative predictive values in high- and low-dose pirfenidone and placebo groups, and compared the positive and negative predictive values between the 2 (or pirfenidone and placebo) groups. Then, "worsened" and "non-worsened (stable or improved)" were considered "positive" and "negative", respectively.

• Comparison of the baseline characteristics between 'worsened' and 'non-worsened' patients at month 3

To examine the profiles of patients with ≥ 5% and < 5% decline in VC ("worsened" and "non-worsened" patients) at month 3, the baseline characteristics (i.e. age, body mass index (BMI), alveolar-arterial oxygen tension (PaO2), SpO2, VC, %VC, total lung capacity (TLC), %TLC, DLCO, %DLCO, KL-6, surfactant protein (SP)-A, SP-D, and dyspnea in daily living assessed with Hugh-Jones (H-J) classification [20]) between "worsened" and "non-worsened" patients at month 3 were compared with Welch's t-test.

• Comparison of the clinical outcome after 1 year between 'worsened' and 'non-worsened' patients at month 3

The clinical outcome (i.e. H-J classification, death, and acute exacerbation) after 1 year were compared between "worsened" and "non-worsened" patients at month 3. Analysis of the H-J classification was performed with Welch's T-test. Analyses of the mortality ratio and incidence of acute exacerbation were with Fisher's exact test.

• Comparison of PFS times with origin at month 3 between 'worsened' and 'non-worsened' patients at month 3

PFS times with origin at month 3 were obtained in a similar manner as described above. Then, the cumulative PFS rates were estimated with K-M method and the distributions of PFS times were compared with log-rank test between "worsened" and "non-worsened" patients at month 3.

Results

Categorical analysis based on 5% change in VC

Improvement ratings (improved, stable, worsened) based on 5% relative change in absolute VC at months 3, 6, 9 and 12 are shown in Figures 1-a (for high-dose pirfenidone and placebo groups) and 1-b (for high- and low-dose pirfenidone and placebo groups). Significant differences in the distributions of the ratings were consistently observed between high-dose pirfenidone and placebo groups (p = 0.0136, 0.0447, 0.0166, and 0.0053, Risk ratio; 0.578, 0.640, 0.671, and 0.665 at months 3, 6, 9, and 12, respectively) (Figure 1-a). Significant differences were also seen between high- and low- dose pirfenidone and placebo groups (p = 0.0064, 0.0381, 0.0091, and 0.0010, Risk ratio; 0.561, 0.652, 0.674, and 0.642 at months 3, 6, 9, and 12, respectively) (Figure 1-b), and between low-dose pirfenidone and placebo groups (data not shown). At months 6, 9, and 12, the risk ratios in (high- and low-dose) pirfenidone group to those in placebo group were approximately 65%., and the risks to be judged 'worsened' were consistently lower in pirfenidone group by approximately 35%.

Figure 1
figure 1

Categorical analysis based on 5% changes in VC at months 3, 6, 9, and 12. Improvement ratings based on 5% changes in VC were defined as "improved (VC 5% increase)", "stable (VC < 5% change)", and "worsened (VC 5% decrease)", using VC values measured at months 3, 6, 9, and 12. a) high-dose vs. placebo groups, b) pirfenidone-treated (high + low-dose) vs. placebo groups. The results are shown by the frequencies of improved (white areas), stable (gray areas), and deteriorated (black areas). P-values by Wilcoxon's test are indicated at the right.

Evaluation using modified progression-free survival based on 5% decline in VC or death

The modified progression of disease was defined by a ≥ 5% decline in absolute VC from baseline or death. K-M plots of PFS times based on the definition and the results of comparison of the distributions of PFS times among the groups with log-rank test are shown in Figure 2-a. Significant differences were shown in the distributions of PFS times between high-dose and placebo groups (p = 0.0149), and between low-dose and placebo groups (p = 0.0034) (Figure 2-a), and between (high-dose and low-dose) pirfenidone and placebo groups (p = 0.0015) (data not shown).

Figure 2
figure 2

Kaplan-Meier plot of Progression-Free Survival (PFS) times in groups of IPF patients. a) The disease progression was defined by a ≥ 5% decline in VC from baseline or death. b) The disease progression was defined by a ≥ 5% decline in VC from baseline on two consecutive occasions or death. Solid line: high-dose; broken line: low-dose; bold broken line: placebo. The distribution of PFS times were compared with log-rank test.

The progression of disease was also defined by ≥ 5% decline in VC on two consecutive data points or death, and K-M plots of the PFS times thus defined and the results of comparison with log-rank test are shown in Figure 2-b. Significant differences in the PFS times were seen between high-dose and placebo groups (p = 0.0011), between low-dose and placebo groups (p = 0.0349) (Figure 2-b), and between (high- and low-dose) pirfenidone and placebo groups (p = 0.0006) (data not shown).

Positive predictive value, negative predictive value with the ratings at month 3

Positive and negative predictive values with the ratings at month 3 in the prediction of those at month 12 in placebo and pirfendone (high- + low-dose) groups are shown in Table 1. In the placebo group, a ≥ 5% decline in VC at month 3 was still present at month 12 at highly rate (positive predictive value; 86.1% (31/36)) and no decline at month 3 was stable at month 12 at a rate of about 50% (negative predictive value; 50.8% (34/67)). On the other hand, in the treated (high- and low-dose pirfenidone) groups, decline at month 3 was still highly present (positive predictive value; 87.1% (27/31), nearly equal to one in the placebo group), and no decline at month 3 was also still stable at month 12 in relatively highly rate (negative predictive value; 71.7% (91/127) (Table 1). To put it briefly, the positive predictive values for pirfenidone and placebo groups were 87.1% and 86.1% respectively, and the difference was not significant. On the other hand, the negative predictive values for pirfenidone and placebo groups were 71.7% and 50.8%, respectively, and significant difference was seen (p = 0.0046).

Table 1 Positive and negative predictive values of the ratings at month 3 in the prediction of the ratings at month 12

Comparison of the baseline characteristics between 'worsened' and 'non-worsened' patients at month 3

The baseline characteristics between 'worsened' and 'non-worsened' patients at month 3 were compared. Patients with VC declined by 5% at month 3 generally had lower means of BMI, PaO2, VC, %VC, TLC, %TLC, and DLCO at baseline (p = 0.0011, 0.0047, 0.0036, 0.0127, 0.0219, 0.0722, 0.0639, respectively), and had higher means of SP-A, SP-D and H-J classification score at baseline (p = 0.0281, 0.0344, 0.0765, respectively) (Table 2).

Table 2 Summary statistics of baseline characteristics for patients with ≥ 5% and < 5% decline in VC at month 3

Comparison of the clinical outcome after 1 year between 'worsened' and 'non-worsened' patients at month 3

We compared the change in H-J classification score from baseline to month 12 with t-test between 2 classes of patients, i.e., those with "worsened (VC ≥ 5% decrease)" and others with "non-worsened (VC < 5% decrease)" at month 3. As a result, significant difference was seen for H-J classification score (p = 0.0002) (Table 3). Additionally, mortality rates for the patients with "non-worsened" and those with "worsened" at month 3 were 2.0% (4/194) and 9.0% (6/67), respectively, and significant difference was recognized (p = 0.0203). Marginal trend was also seen in the prevalence of acute exacerbation between the 2 classes of patients (p = 0.1031) (Table 4).

Table 3 Outcome of patients; Change from baseline to month 12 in H-J classification for patients with ≥ 5% and < 5% decline in VC
Table 4 Outcome after month 12; Mortality ratio and incidence of acute exacerbation in patients with ≥ 5% and < 5% decline in VC

Comparison of PFS times with origin at month 3 between 'worsened' and 'non-worsened' patients at month 3

K-M plot of the PFS times with origin at month 3 for patients with and without 5% decline of VC at month 3, added the result of log-rank test, is shown in Figure 3. There was no significant difference in the distributions of PFS times between the 2 classes of patients (p = 0.8835).

Figure 3
figure 3

K-M plot of PFS times with origin at month 3 in groups of patients with and without 5% decline in VC at month 3. Solid line with closed circle: No decline in VC at month 3; broken line with plus: a ≥ decline in VC at month 3. P-value was 0.8835 with log-rank test.

Discussion

We report that the efficacy of pirfenidone in Japanese phase III trial was supported by the evaluation using the improvement ratings, PFS times and positive/negative predictive values based on 5% decline in VC. Further, the baseline characteristics of patients with ≥ 5% decline at month 3 were generally severe, and the clinical outcomes of those patients including mortality were also significantly worsened after 1 year.

According to a preceding report [12], comparison of the distributions of the improvement ratings (improved, stable, or worsened) based on 10% change in VC did not show significant differences between pirfenidone and placebo groups. The comparison of the ratings using 5% change in VC, however, revealed significant differences between pirfenidone and placebo groups at months 3, 6, 9 and 12 (Figure 1), and approximately 35% reduction in risk in this malignant disease would support the use of pirfenidone in clinical practice. Thus, when the 5% change in VC was used as an index, efficacy of the drug was evaluated with higher sensitivity than when the 10% change in VC was used. The 5% change in VC may seem only a slight change, but the annual decline in VC in the placebo group is said to be approximately 150 to 200 mL in many recent clinical trials [12, 2125]. In the phase III trial of pirfenidone, the annual decline in VC in the placebo group was 160 mL on average [12], and the mean baseline VC in the placebo group was 2472.3 mL, from which the annual rate of decline is calculated to be approximately 6.5%. That is, if a ≥ 10% change in VC is used as an index for evaluation over a period of a year, it may not be sensitive enough to detect efficacy of the drug, especially for changes within a shorter period of time such as 3 months and 6 months. Results of this sub analysis revealed that using a 5% change as an index improved the chances of detecting efficacy of the drug. Our results are considerably similar to those of extended analysis of the IFIGENIA study investigating the effect of N-acetylcysteine (NAC) in IPF, which also showed significance of a 5% threshold [26]. However, it should be noted that use of a smaller change as an index may require more accurate VC measurements.

According to the preceding report, the progression of disease was defined by the ≥ 10% decline in VC or death for evaluation of progression-free survival [12]. Results showed that the p-value of the difference between groups high-dose and placebo was 0.0280 and between groups low-dose and placebo was 0.0655. In this paper, the progression of disease was defined by the ≥ 5% decline in VC from baseline or death, and K-M plots were generated using thus defined PFS time. As a result, there were significant difference between groups high-dose and placebo and between groups low-dose and placebo (p = 0.0149 and p = 0.0034, respectively), (Figure 2-a) which seems to be more evident than those in the previous analysis by 10% decline [12]. When the progression of disease was defined by a ≥ 5% decline in VC from baseline on two successive occasions or death, the highly significant differences were also observed (Figure 2-b), which supported the result of Figure 2-a.

Early identification of the response to therapeutic medication provides a clue in clinical decision making on treatment policy. We analyzed the positive/negative predictive values using the improvement ratings of months 3 and 12 based on 5% decline in VC. From the results of the differences of negative predictive values between placebo (50.8%) and pirfenidone (71.7%) groups, the efficacy of pirfenidone was also demonstrated (p = 0.0046). Thus, about 70% of patients assessed as non-progression at month 3 in pirfenidone group might remain in the state at 1 year. However, the results of the positive predictive values of placebo and pirfenidone groups showed that both values were very high, i.e., 86.1% and 87.1%, respectively. These results showed that the progression detected at month 3 remained (not reversed) at month 12 in most cases. These analyses suggested the possibility of identifying whether patients respond to pirfenidone or not at early phase after intervention, and of motivating patients to continue medication.

On the other hand, it will be a crucial question whether treatment should be withdrawn in patients who decline by ≥ 5% in VC at month 3. Patients with VC declined by 5% at month 3 generally had lower means of PaO2, VC, %VC, TLC, %TLC, and DLCO at baseline, and had higher means of SP-A, SP-D and dyspnea in daily living assessed with H-J classification score at baseline (Table 2). It was suggested that those patients with impairment of these baseline characteristics may lead to be corresponded to relatively "rapid progressors" in IPF, and treatment of any additional therapy would be recommended as soon as allowed. The effect of additional therapy strategies, such as combination with NAC [22] or BIBF-1120 [27], should be addressed in further clinical trials.

In order to translate the 5% decline in VC into a clinical relevant outcome, we compared the clinical outcomes (dyspnea in daily living assessed with H-J classification, mortality rate, and incidence of acute exacerbation) between 2 classes of patients, i.e., those with "worsened (VC ≥ 5% decrease)" and others with "non-worsened (VC < 5% decrease)" at month 3 (Table 3, 4). In short, dyspnea in daily living and mortality rate of patients with worsened at month 3 were significantly worsened after 1 year. Similar trend was also seen in the prevalence of acute exacerbation between the 2 classes of patients, which marginally supported the significance of the 5% change in VC. We speculated that the patients with 5% decline in VC at month 3 have further progression more easily; however, PFS times with origin at month 3 were not different between patients with or without 5% decline in VC at month 3 (Figure 3). Namely, it is noted that declines in VC at month 3 do not mean the possibility of further progression in next 9 months, i.e., month 3 to 12. In summary, except for the results of PFS times, it was suggested that a 5% decline in VC at month 3 is a clinically meaningful indicator in IPF and may be a useful prognostic factor. As the potential limitation, it should be addressed that these analytical results were obtained by the small number of subjects with death or prevalence of acute exacerbation within a one year study period.

Conclusion

Results shown in this paper suggested that when 5% change in VC was used as an index instead of the 10% change, the efficacy of pirfenidone could be evaluated with higher sensitivity and robustness over the 12 month study. It was also shown by the results that the 5% change in VC at month 3 is suggested to be a clinically useful and significant promising prognostic factor of IPF.