Pulmonary arterial hypertension (PAH) remains an important cause of mortality and morbidity in systemic sclerosis (SSc). Classification of PAH as Group 1 within the Pulmonary Hypertension (PH) WHO clinical classification system has permitted inclusion of patients with SSc in numerous interventional trials and has resulted in the licensing of many agents, including endothelin receptor antagonists [1], type V cGMP phosphodiesterase inhibitors and prostacyclins, including parenteral and inhaled delivery systems. While this has been a fortunate circumstance for our patients, numerous critically important questions remain unaddressed. In general, although the recently updated classification of PH retains SSc cases within Group 1 [2], evidence suggests that patients with PAH related to SSc (PAH-SSc) show blunted responses to therapy when compared with those with idiopathic PAH, including key measures of outcome such as the six-minute walk test, time to clinical worsening and survival [3]. The very presence of SSc provides an enriched population at high risk of PAH and should offer the opportunity for early diagnosis, yet registry and centre-based data reveal no improvement in referral intervals. Part of this may reflect an increased understanding of the lack of sensitivity and specificity of echocardiography, particularly at the lower end of pulmonary pressures, as well as the confounding issues posed by concomitant interstitial lung disease. Finally, patients with SSc tend to be under-represented in modern trials, which are typically rather short in duration (12 to 18 weeks), resulting in inadequate data to support definitive recommendations [4].

With this background, there has recently been a systematic effort to improve the assessment of PAH occurring in association with SSc. The main drivers for this have included a desire for better validated endpoints that could be used as a core set applied to clinical trials, the wish for a clinically meaningful endpoint that would reflect practice, and the need for less invasive longitudinal assessment tools that might replace right heart catheterisation (RHC) as the perceived gold standard test for PAH. At present, RHC is essential for diagnosis but there are questions about its feasibility as a tool to follow patients clinically over time. There is a clear need for a non-invasive endpoint as well as for validation and critical analysis of the effectiveness of screening modalities. There are particular challenges in addressing this for PAH-SSc, a condition that requires multidisciplinary care and in which patients may present to, and be followed up by, a number of different subspecialists. Each will be an expert in their own field and be familiar with managing and interpreting certain investigations, but there will be differences of opinion between the different experts as to what the best tests are and how they should be interpreted and used in practice. In addition, the tools that may be validated in other forms of PAH are unlikely to have been formally assessed in PAH-SSc, which has its own unique characteristics that may affect standard PAH outcome measures; an example is the associated musculoskeletal manifestations. Moreover, there may be differences for other forms of connective tissue disease-associated PAH. Biases are also introduced by differing clinical experience and by differing familiarity with clinical trials.

One approach that has been used successfully in rheumatology, and that has a clearly defined framework, is the Outcome Measures in Rheumatology (OMERACT) methodology. This assesses potential disease measures for clinical trials by considering their utility under the subcategories of the OMERACT filter [5]. The Expert Panel on Outcomes Measures in PAH related to Systemic Sclerosis (EPOSS)-OMERACT group was established to begin to apply this approach to evaluating PAH-SSc. This group has integrated expertise in cardiology, pulmonary medicine, rheumatology and biostatistics as well as clinical trial design and outcome development and validation. It has applied the OMERACT filter to individual tools that could represent endpoints in trials and has critically reviewed the published literature to explore the extent to which outcomes have been validated. In addition, it has sought to develop consensus about individual outcomes. In particular, the EPOSS group has identified through a Delphi process a series of recommended domains and their assessment tools [6]. The data that could validate these tools have been considered systematically and this has led to a series of important observations. One of the goals was to identify a measure or series of measures that could replace RHC as the gold standard of assessment. These are significant achievements and have resulted in a series of relevant publications. So far, six substantive papers [6-11] have been published as a direct result of the EPOSS initiative, and more are expected.

Work is now underway to validate the individual components and to review the available data that provide some validation. This is a daunting task because, for many tools, there are insufficient results from research studies to undertake it. First attempts to evaluate the routine clinical tool of Doppler-echocardiography are testament to the challenge that lies ahead [7]. An important output of this exercise has been the definition of a research agenda to prioritise effort in addressing the data that are available and to determine the extent to which the available information from national datasets (such as the French Intinair project [12], UK single centre registries and compERA-XL [13]) or from clinical trial datasets might be interrogated. One limitation of most clinical trials is that they usually include only a minority, typically around 20%, of cases with connective tissue disease-associated PAH, and fewer still with PAH-SSc. The EPOSS group provides a template for the type of international multidisciplinary approach that could tackle these important challenges.

In the meantime, the clinical arena has moved on and a large number of major clinical trials in PAH include cases of PAH-SSc. There has been a strong suggestion that a composite endpoint that reflects clinical practice be used. This has become defined as the time to clinical worsening (TTCW). A formal measure of TTCW has emerged as an attractive composite endpoint that measures progression in PAH. It was originally included as a secondary endpoint in several major clinical trials that led to licensing of PAH therapies based upon a primary endpoint of change in exercise capacity (the six-minute walk test). At face value it makes obvious sense, especially as a clinically meaningful endpoint that may be used in licensing and in post-licensing evaluation of therapies. However, the devil is in the detail. Different studies have used different components in the TTCW definition and there are major potential local differences in practice that may make a measure unworkable or unreliable in different centres. Thus, some centres have an outpatient ambulatory emphasis whereas others may often hospitalise cases of PAH. In addition, availability of therapies and expertise in procedures such as surgical intervention or transplantation may be relevant. Moreover, as discussed above, it is likely that different standards may be applicable for PAH-SSc versus idiopathic PAH due to co-morbidity and potential differences in outcome and progression of PAH and in suitability for treatments. In the short term, TTCW is very likely to be adopted as a useful measure and one that is especially relevant in early stage disease where stabilisation can be a very appropriate management goal. However, this should serve as an impetus to further research to validate and understand individual components of TTCW and to develop new and potentially better composite tools.
In particular, there are self-evident reasons why some of the components of TTCW are likely to be unreliable or not comparable in PAH-SSc. Musculoskeletal involvement and co-morbidity, such as lung fibrosis or cardiac complications, are clearly likely to affect exercise capacity. One study suggests that musculoskeletal deconditioning is the major determinant of six-minute walk test distance [14], and no relationship with parameters of lung function has been shown [15]. There are multiple causes of disease-related mortality in SSc - for example, renal crisis, lung fibrosis and gut disease - and so mortality cannot be taken as a surrogate for PAH outcome. Finally, co-morbidity and age make PAH-SSc cases much less likely than idiopathic PAH cases to be referred for transplantation and even less likely to be transplanted. The impact of these individual component differences may be cancelled out in a composite score but this cannot be assumed. This should be considered especially in subtypes of PAH, and so the research work now emerging from the EPOSS initiative is likely to be very relevant. In the meantime, the recent report from the Fourth World Congress in PH in Dana Point, USA, represents current best expert consensus on how to standardise and use TTCW as an outcome measure in PAH clinical trials [16]. It seems likely that TTCW will be a benchmark in future studies. It has already replaced exercise capacity assessed by the six-minute walk test distance; in general, there is now a move towards a robust haemodynamic endpoint or the composite clinical measure.

So where does that leave the EPOSS initiative? The work could be regarded as done and the TTCW be adopted as a gold standard. RHC would remain for diagnosis but would only be performed later as directed clinically. But this would not be a correct approach for PAH-SSc. Table 1 highlights some specific aspects that would be relevant to the domains and measures that have been identified through the EPOSS initiative and that may be incorporated into composite endpoints, including TTCW. There is a strong need for systematic validation and much to be learnt from the defined research agendas along the way. The two concepts must co-exist, but cannot do so without interplay, so that each may inform the other. It can be argued that rigorous concerns about validity of endpoints could have severely impeded progress and treatment opportunities. The challenge must now be faced, however, so that there is a real consensus that can be applied and so that eventually the clinical needs of patients, the methodological needs of trialists and the exacting standards of the regulatory authorities that license new agents can all be met. For the time being, TTCW is probably the most usable endpoint for clinical studies but the component terms need better standardisation and must be clearly defined. In the future, through initiatives such as EPOSS, these components can be validated in PAH-SSc, and it is imperative that the challenge of this task is not used as justification for not addressing these important questions.

Table 1 Domains and measurement tools for the assessment of pulmonary arterial hypertension in systemic sclerosis