Background

Venous thromboembolism (VTE) includes deep vein thrombosis (DVT) and pulmonary embolism (PE) and is an important cause of morbidity and mortality [1]. VTE is also associated with other conditions that influence patients’ prognosis, in particular cancer [2-5]. VTE may complicate the course of a patient with known cancer, but it may also be the first manifestation of cancer [6]. According to a systematic review, up to 10% of patients presenting with idiopathic VTE are subsequently diagnosed with cancer during the first year of follow-up [7, 8]. Moreover, mortality at one year is higher in patients with VTE who develop cancer than in those who do not [5, 9, 10].

Suspicion of underlying cancer may lead clinicians to screen for cancer and provide closer surveillance following an acute episode of VTE [7, 9, 11, 12]. However, unselected screening can lead to a higher rate of false positive results, inducing unnecessary anxiety and increasing costs [13]. Conversely, a lack of surveillance after the diagnosis of VTE may delay detection of potentially treatable cancers [8, 9]. At present, clinicians typically assess patients’ cancer risk after VTE using conventional approaches to cancer screening that are based on classic risk factors [14-18]. Recent guidelines have proposed specific work-up strategies for these patients, including computed tomography [19]. However, little evidence exists to help identify which individuals, among the entire population of patients with VTE, should undergo such screening.

We therefore sought to construct a clinical predictive score that could stratify patients according to their risk of subsequent cancer or death. Our overall goal was to identify patients that might benefit from a more intensive screening strategy and surveillance.

Methods

Study population

We conducted our study using an institutional registry of 1264 consecutive patients who were admitted between June 2006 and December 2011 to Hospital Italiano, a tertiary teaching hospital in Buenos Aires, Argentina [20]. All adult patients (both inpatients and outpatients, aged > 17 years) presenting with a new diagnosis of VTE were included in the registry database (Microsoft Access, Redmond, Washington) after providing informed consent. The study protocol was approved by the ethics review board of the Hospital Italiano de Buenos Aires. A full-time research fellow screened all patients at initial diagnosis and updated the database during all follow-up visits. The registry contains information on baseline demographics, clinical history and co-morbidities, physical examination, and laboratory and radiological data. It also contains information on vital status and cancer diagnosis during follow-up; cancer diagnosis was ascertained from electronic charts, as a clinical or pathology-based diagnosis. The routine practice at the institution is to continue to follow these patients until death or until they are lost to follow-up. Frequency of follow-up and cancer screening was left to the discretion of the individual physicians.

Because patients with overt cancer who develop VTE differ from those who present with VTE as a first manifestation of malignancy, we excluded patients whose cancer diagnosis preceded VTE, those who were diagnosed with VTE and cancer at the same time (during the same month), and those who died during the first month following VTE diagnosis. Finally, we excluded those with less than one year of follow-up. From the entire sample we created a derivation cohort by randomly selecting two-thirds of the patients; the remaining third became the validation cohort.
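A random two-thirds/one-third partition of this kind can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the fixed seed, and the shuffle-then-cut approach are all assumptions. An exact two-thirds cut of 540 patients gives 360 and 180, whereas the cohorts reported in the Results contain 349 and 191 patients, so the original randomization evidently differed in its details.

```python
import random


def split_cohort(patient_ids, derivation_fraction=2 / 3, seed=0):
    """Randomly partition patient IDs into derivation and validation cohorts.

    Hypothetical helper: the name, the seed and the shuffle-then-cut
    strategy are illustrative assumptions, not taken from the study.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    rng.shuffle(ids)
    n_derivation = round(len(ids) * derivation_fraction)
    return ids[:n_derivation], ids[n_derivation:]


# 540 eligible patients, identified here by placeholder integers.
derivation, validation = split_cohort(range(540))
```

Every patient lands in exactly one cohort, which keeps the validation cohort untouched during model development.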

Model development

We used available variables to construct models that would predict cancer (primary outcome) and the composite of cancer or death (secondary outcome). Death was included as part of the secondary composite outcome because it may act as a competing event for the development of cancer. We first selected potentially useful baseline predictor variables for the multivariable model based on clinical experience and previous literature. Candidate variables included demographic characteristics (age, sex), classic risk factors for thromboembolic disease (major surgery, previous VTE, family history), coexisting illnesses (Charlson comorbidity index score [21]), body mass index (BMI), and laboratory tests (albumin, hemoglobin). We dichotomized continuous variables using their median values as follows: age ≥ 70 years; score on the Charlson comorbidity index ≥ 2; albumin level ≤ 2.5 g/l. Variables were retained only if they remained associated with the primary outcome in a multivariable logistic model using the full model fit [22].
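The dichotomization step can be illustrated with a short sketch. The thresholds are those quoted above; the class, field, and key names are hypothetical, and albumin is assumed to be recorded in the same units as the stated cut-off.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    """Hypothetical container for the continuous baseline variables."""
    age_years: float
    charlson_index: int
    albumin: float  # assumed to be in the same units as the quoted threshold


def dichotomize(patient: Baseline) -> dict:
    """Convert continuous variables to binary indicators at the quoted cut-offs."""
    return {
        "age_ge_70": patient.age_years >= 70,
        "charlson_ge_2": patient.charlson_index >= 2,
        "albumin_le_2_5": patient.albumin <= 2.5,
    }


# Example: a 70-year-old with a Charlson index of 1 and albumin of 2.6.
indicators = dichotomize(Baseline(age_years=70, charlson_index=1, albumin=2.6))
```

Binary indicators of this form are what allow each retained variable to contribute a fixed number of points to a bedside score.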

Score generation

We assigned point scores for each variable in the final model by rounding the corresponding coefficients to integers [23]. We then calculated a total score for every patient by adding the individual points for each risk factor that was present. We calculated sensitivity, specificity, negative and positive predictive values (with 95% confidence intervals) for each cut-off point of the score in order to predict cancer or death at one year [24]. We also calculated negative and positive likelihood ratios (with 95% confidence intervals) [25].
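The scoring and test-performance calculations described above can be sketched as follows. The coefficient values and factor names are hypothetical placeholders (the fitted coefficients appear only in the tables and additional files), and the sketch assumes that no cell of the two-by-two table is empty, so that no division by zero occurs. Confidence intervals are omitted.

```python
# HYPOTHETICAL coefficients, for illustration only; not the fitted values.
EXAMPLE_COEFFICIENTS = {"factor_a": 1.9, "factor_b": 1.2, "factor_c": 0.8}


def points_from_coefficients(coefficients):
    """Round each regression coefficient to the nearest integer point value."""
    return {name: round(beta) for name, beta in coefficients.items()}


def total_score(points, present_factors):
    """Sum the points of every risk factor present in a patient."""
    return sum(points[name] for name in present_factors)


def confusion_at_cutoff(scores, outcomes, cutoff):
    """Count TP, FP, FN, TN when 'score >= cutoff' is treated as a positive test."""
    tp = fp = fn = tn = 0
    for score, had_outcome in zip(scores, outcomes):
        positive = score >= cutoff
        if positive and had_outcome:
            tp += 1
        elif positive:
            fp += 1
        elif had_outcome:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values and likelihood ratios."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


points = points_from_coefficients(EXAMPLE_COEFFICIENTS)
example_metrics = diagnostic_metrics(tp=30, fp=20, fn=10, tn=140)  # made-up counts
```

Repeating `confusion_at_cutoff` and `diagnostic_metrics` for each possible cut-off yields one row of test characteristics per score value.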

Validation of the prediction rule

We assessed calibration and discrimination in both the derivation and validation cohorts. We assessed calibration using the Hosmer-Lemeshow test [26] and by comparing the actual and predicted outcomes within each point stratum for the derivation and validation cohorts. We evaluated discrimination using receiver operating characteristic (ROC) curves [27]. We compared ROC curves for both cohorts according to the method described by Hanley et al. [28].
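The discrimination analysis can be sketched as below. The area under the ROC curve is computed as the probability that a randomly chosen patient with the outcome scores higher than one without it. The comparison of the two cohorts assumes that the cited method is the Hanley and McNeil standard-error approximation applied to two independent samples; this is an assumption, since the exact formula is not spelled out here. Function names are illustrative, and the Hosmer-Lemeshow test is not covered.

```python
import math


def auc(scores_with_outcome, scores_without_outcome):
    """AUC as the proportion of case/non-case pairs ranked correctly (ties count 1/2)."""
    wins = 0.0
    for case_score in scores_with_outcome:
        for control_score in scores_without_outcome:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(scores_with_outcome) * len(scores_without_outcome))


def hanley_mcneil_se(area, n_with_outcome, n_without_outcome):
    """Approximate standard error of an AUC (Hanley-McNeil formula, assumed here)."""
    q1 = area / (2 - area)
    q2 = 2 * area ** 2 / (1 + area)
    variance = (
        area * (1 - area)
        + (n_with_outcome - 1) * (q1 - area ** 2)
        + (n_without_outcome - 1) * (q2 - area ** 2)
    ) / (n_with_outcome * n_without_outcome)
    return math.sqrt(variance)


def compare_independent_aucs(area_1, se_1, area_2, se_2):
    """z statistic and two-sided p-value for two AUCs from independent samples."""
    z = (area_1 - area_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Standard error for an AUC of 0.75 with 32 cases among 349 patients.
se_derivation = hanley_mcneil_se(0.75, 32, 349 - 32)
```

The pairwise AUC is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a rank-based formulation would be preferable for much larger samples.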

Results

The institutional registry contained 1264 patients who were diagnosed with new VTE between June 2006 and December 2011, and complete follow-up information was available on 1211 (95.8%). Of these, we excluded 494 (40.8%) patients who had previously been diagnosed with cancer, 132 (10.9%) who died during the incident hospital admission or during the first month of follow-up, and 45 (3.7%) who were diagnosed with VTE during the last year of the study. A random selection of 349 (two-thirds) of the 540 remaining patients comprised the derivation cohort, and the other 191 patients (one-third) comprised the validation cohort.

Patient characteristics

During one year of follow-up, approximately one-quarter of patients (92; 26.4%, 95% CI 21.4% to 30.6%) died (83; 23.7%, 95% CI 18.5% to 27.4%) or developed cancer (32; 9.2%, 95% CI 18.5% to 27.4%). Lung cancer was the most commonly diagnosed malignancy (21.9%, 95% CI 7.6% to 36.2%), followed by haematological disorders (18.7%, 95% CI 5.2% to 32.2%). Nearly one-third of patients developed metastatic disease (Additional file 1). Patients with the primary outcome of cancer had more comorbidities and previous VTE, and were less likely to have had recent surgery (Table 1). Patients who developed cancer during follow-up had higher mortality than those who did not (71.9% vs. 18.9%; p < 0.0001) (Table 2).

Table 1 Characteristics of patients with versus without cancer at one year
Table 2 Characteristics of patients with versus without cancer or death

Score development

The multivariable logistic regression model to predict one-year risk of cancer retained the following variables: Charlson comorbidity score, previous VTE, and recent surgery. In the model predicting cancer or death, age and albumin were also retained (Additional file 1). The resulting score values derived from rounding the beta coefficients were the same for both outcomes (Table 3).

Table 3 Final scoring systems

Score performance

We estimated the predicted probability of developing the primary and secondary outcomes using a logistic regression model in both the derivation and validation cohorts (Additional file 1). Hosmer-Lemeshow goodness of fit testing showed good calibration (p=0.65 and p=0.94 in the derivation and validation cohorts, respectively). The final score to predict cancer alone had an area under the curve (AUC) of 0.75 (95% CI 0.66-0.84) and 0.79 (95% CI 0.63-0.95) in the derivation and validation cohorts, respectively. The final score to predict the combined outcome of cancer or death had an AUC of 0.72 (95% CI 0.66-0.78) and 0.71 (95% CI 0.63-0.79) in the derivation and validation cohorts, respectively (ROC curves in Additional file 1). The sensitivities, specificities, positive and negative predictive values, and likelihood ratios associated with each point of the final scores are shown in Table 4 and Table 5.

Table 4 Test performance for primary outcome (Cancer)
Table 5 Test performance for secondary outcome (Death or Cancer)

Discussion

We developed clinical scores to classify patients according to their risk of cancer, or of cancer or death, at one year of follow-up after a new VTE. The final scores employ common and readily available clinical variables and can be easily calculated at the bedside at the time of VTE diagnosis. In our cohort, the scores had good discrimination and calibration, and could differentiate across a wide range of risks for developing cancer, from only 2% (0 points) to greater than 90% (5 points). In addition, our score was able to stratify patients’ risk of cancer or death from 6% (0 points) to greater than 70% (6 points or more). These simple scores therefore not only provide important prognostic information but might also be used to identify patients who would benefit from closer surveillance and additional investigations.

The ultimate goal of estimating prognosis is to improve clinical decision-making and thereby improve patient outcomes. Our scores may lead to the diagnosis of some malignancies at an earlier stage, and could therefore result in earlier cancer treatments. In addition, some experts have advocated for alterations to anticoagulation strategies in patients with VTE who also have underlying cancer [29]. Conversely, excluding patients who are at low risk for developing cancer or death from screening strategies and investigations should lead to fewer false positive results, avoid unnecessary treatment strategies, and reduce overall costs. Our score also identifies patients that are at higher risk of death, regardless of their risk of cancer, and this could in turn motivate clinicians to address other conditions such as chronic heart failure or coronary heart disease that might be contributing to this higher mortality risk. We provide an example of potential responses to different score results using hypothetical scenarios in Table 6.

Table 6 Possible clinical scenarios and application

Our study has several strengths. We used a large and comprehensive clinical dataset that was developed specifically to follow consecutive patients with newly diagnosed VTE. The initial evaluation and data collection occurred soon after the VTE diagnosis, increasing the clinical utility of our final scoring system. The loss to follow-up at one year remained very low, decreasing the risk of selection bias. Finally, our cohort includes a large number of patients from across Argentina and from different social backgrounds, increasing the generalizability of our final score.

Our study also has several limitations. We could only evaluate variables that were contained in our database, and it is likely that other clinical variables could increase the predictive accuracy of our score. However, the variables in our final scores are widely available and easily obtained, which should improve the external validity of our model. We included only baseline variables in our model, and were unable to evaluate characteristics that evolve over time and that might further influence a patient’s risk of cancer or death. Our model had high discrimination in both the derivation and validation cohorts, similar to that observed for other widely used predictive models [30], but it will still lead to some misclassification of patients. In addition, our validation cohort was derived from the initial sample and was not an independent cohort; it is likely that some loss in discrimination will occur when our scores are applied in other populations. Another limitation is the lack of standardized cancer screening, making it possible that our study is biased by physicians’ decisions to request additional screening tests for patients having the same risk factors identified in our study. However, surveillance using radiological imaging was common throughout the study, with 80% and 34% of patients receiving chest and abdominal computed tomography, respectively, in the first year following VTE diagnosis. Although our scores should help physicians identify patients at higher risk of cancer, it remains unknown whether earlier diagnosis will lead to improved survival [31], especially considering that cancers associated with VTE often have a relatively poor prognosis [6]. Finally, interpreting intermediate risk scores is a challenge common to most predictive models; the optimal approach to surveillance and investigation of these patients is even more uncertain than for those at low or high risk.

Conclusion

We have developed a simple and clinically relevant score that can predict risk of developing cancer in patients with newly diagnosed VTE. This score could be used to help reassure low risk patients, or to identify high-risk patients that might benefit from increased surveillance and additional investigations. However, our tool should be validated in an externally derived cohort to evaluate its generalizability before it is routinely adopted into clinical practice.