Key Points

This is the first study to validate claims-based algorithms for detecting patients with bone metastases in Japan.

We evaluated two algorithms based on (1) diagnosis codes alone and (2) diagnosis codes plus imaging test codes.

The diagnosis code-based algorithm had a positive predictive value of 84.1%.

The algorithm based on the combination of diagnosis and imaging test codes had a positive predictive value of 82.6%.

The diagnosis code-based algorithm demonstrated sufficient utility for identifying bone metastases in Japanese hospitals’ claims data.

1 Introduction

Routinely collected health data, or real-world data (RWD), are not originally collected for research, but they provide a large amount of data that can be utilized for research. Therefore, RWD are increasingly being used in clinical, epidemiological, and health economics studies. RWD include administrative claims data, electronic medical records, primary-care surveillance data, and disease registries [1]. Each data source has strengths and limitations [2]. Administrative claims data, for example, have advantages such as large sample size, high population representativeness, longitudinal nature, and cost-efficiency in data collection [2, 3]. In Japan, administrative claims data are commonly used in RWD studies.

Administrative claims data also pose challenges for research use [4]. In claims data studies, outcomes and exposures are defined based on claims records. However, because these records are entered for reimbursement purposes, they do not necessarily capture the correct medical status of patients. Inaccuracies in these data can cause misclassification of outcomes and exposures, which can undermine the credibility of the study results [5]. To address this problem, researchers should use validated claims-based algorithms to identify the patients or outcomes [6]. In contrast to Western countries, where algorithm validation studies have been actively conducted [7, 8], such validation is not yet common practice in Japan [9], despite the growth of claims data studies. Algorithm validation studies should be conducted more often in Japan to enhance the quality of real-world evidence based on claims data studies.

Bone is a common site of cancer metastasis. Bone metastases may occur in any cancer [10], but are particularly common in breast and prostate cancers, followed by thyroid, kidney, and bronchus cancers [11]. Bone metastases can be accompanied by skeletal-related events (SREs), such as pain, hypercalcemia, pathological fracture, and spinal cord or nerve root compression [12, 13], thereby seriously affecting daily life and impairing health-related quality of life [14, 15]. Although the direct impact of bone metastases on patients’ survival is unclear due to the multiple factors involved [10, 16], bone metastases or SREs are associated with increased mortality [17, 18]. Given this substantial clinical burden, the appropriate management of patients with bone metastases, including the prevention of SREs, has been an important topic in oncology and public health research.

Outside Japan, multiple studies have evaluated the clinical and economic burden of bone metastases (and SREs) using administrative claims data. Some used diagnosis records alone to identify bone metastases [19,20,21,22,23], whereas others used a combination of diagnosis records and procedure or prescription records [24,25,26]. However, most studies did not use validated algorithms. Although a few Western studies have evaluated the validity of algorithms to identify bone metastases using claims data [27,28,29,30], such validation studies have yet to be conducted in Japan. A validated algorithm to identify these patients would help generate high-quality, real-world evidence on this critical clinical condition in Japan.

Therefore, in this study, we evaluated the validity of algorithms to identify patients with bone metastases using administrative claims data from a Japanese hospital. We evaluated two candidate algorithms, based on either (1) diagnosis records alone or (2) a combination of diagnosis and examination records, using patient chart review as the gold standard.

2 Material and Methods

2.1 Study Design and Data Source

A cross-sectional study was conducted at a university hospital, the Juntendo University Hospital (Tokyo, Japan). Juntendo University Hospital is a designated regional cancer care hospital with 1051 beds that treats approximately 32,800 inpatients and 984,000 outpatients per year. This validation study used administrative claims data and electronic medical records for patients who visited the hospital during a 2-year study period from April 2017 to March 2019 and who had a record with the International Statistical Classification of Diseases and Related Health Problems, 10th revision (ICD-10) code C79.5 (secondary malignant neoplasm of bone and bone marrow). In the administrative claims data, diagnoses are recorded as disease names, seven-digit Japanese claims codes (disease codes), and ICD-10 codes. Additionally, procedures and examinations are recorded as procedure names and nine-digit Japanese claims codes (procedure codes).

This study was approved by the ethics committee of Juntendo University Hospital. As this study was a retrospective chart and claims data review, informed consent was not required, and hence was not obtained. However, information about the study, including the purpose and data use, was posted on the hospital’s website, ensuring that participants had the right to opt out.

2.2 Candidate Algorithms

Based on a review of previous literature [27, 28] and expert opinions, we derived two algorithms to identify patients with bone metastases:

  1) Algorithm 1: Presence of at least one diagnosis record of bone metastases (a subset of the diseases with ICD-10 codes of C79.5) during the study period.

  2) Algorithm 2: Presence of at least one diagnosis record of bone metastases, with a record of imaging test in the month or preceding month of the diagnosis of bone metastases.

Bone metastases were defined as a subset of the diseases with ICD-10 codes of C79.5 using the seven-digit disease codes for the diseases or conditions of bone metastases excluding bone marrow infiltration (Online Supplementary Material (OSM) Table 1); these diseases were all classified as the ICD-10 code of C79.5. A disease code marked as a suspected diagnosis (a disease name that is assigned to a test order, and remains as suspected until the diagnosis is confirmed by a doctor) was not considered evidence of bone metastasis.

In Algorithm 2, we added an imaging test condition to Algorithm 1, expecting that it would increase the accuracy of identifying the target population. Imaging tests were defined as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), or bone scintigraphy. The nine-digit procedure codes for these examinations are provided in OSM Table 2.
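The two definitions can be illustrated with a minimal Python sketch. The record layouts, field names, and code values below are hypothetical and are not the hospital's data format; the actual code lists are those in OSM Tables 1 and 2. The sketch also assumes that the input is already restricted to the study period and that the imaging window is anchored on the first confirmed diagnosis record, which is one possible reading of the definition.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Diagnosis:
    patient_id: str
    disease_code: str   # seven-digit disease code (placeholder values in examples)
    year: int
    month: int
    suspected: bool     # True if the record is flagged as a suspected diagnosis


@dataclass(frozen=True)
class Procedure:
    patient_id: str
    procedure_code: str  # nine-digit procedure code (placeholder values in examples)
    year: int
    month: int


def month_index(year: int, month: int) -> int:
    """Map a calendar month to a running integer so adjacent months differ by 1."""
    return year * 12 + (month - 1)


def first_diagnosis_month(diagnoses: Iterable[Diagnosis],
                          bone_met_codes: Set[str]) -> Dict[str, int]:
    """Earliest confirmed (non-suspected) bone metastasis record per patient."""
    first: Dict[str, int] = {}
    for d in diagnoses:
        if d.disease_code in bone_met_codes and not d.suspected:
            m = month_index(d.year, d.month)
            first[d.patient_id] = min(m, first.get(d.patient_id, m))
    return first


def algorithm1(diagnoses: Iterable[Diagnosis], bone_met_codes: Set[str]) -> Set[str]:
    """Patients with at least one confirmed bone metastasis diagnosis record."""
    return set(first_diagnosis_month(diagnoses, bone_met_codes))


def algorithm2(diagnoses: Iterable[Diagnosis], procedures: Iterable[Procedure],
               bone_met_codes: Set[str], imaging_codes: Set[str]) -> Set[str]:
    """Algorithm 1 plus an imaging record in the same or the preceding month."""
    imaging_months: Dict[str, Set[int]] = {}
    for p in procedures:
        if p.procedure_code in imaging_codes:
            imaging_months.setdefault(p.patient_id, set()).add(
                month_index(p.year, p.month))
    selected: Set[str] = set()
    for pid, m in first_diagnosis_month(diagnoses, bone_met_codes).items():
        if imaging_months.get(pid, set()) & {m, m - 1}:
            selected.add(pid)
    return selected
```

By construction, every patient selected by `algorithm2` is also selected by `algorithm1`, which mirrors the nesting of the two definitions.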

2.3 Study Population and Sampling

Of all patients in the obtained data, we identified patients 18 years or older who met the above-defined Algorithm 1 or Algorithm 2. Because Algorithm 2 is Algorithm 1 with an additional condition and is therefore more stringent, patients for the two algorithms were sampled sequentially. First, we randomly sampled 100 patients from those identified by Algorithm 1. Of these 100 patients, 88 also met the definition of Algorithm 2; we then sampled an additional 12 patients from those identified by Algorithm 2 to obtain 100 patients in total for Algorithm 2. We set the sample size to 100 because it is considered sufficient to evaluate the diagnostic accuracy of an algorithm with an acceptable level of precision under this sampling method (i.e., random sampling from the patients meeting the outcome) [31].
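The sequential sampling can be sketched as follows. This is an illustrative Python sketch rather than the procedure actually coded for the study: the function name, the fixed seed, and the choice to draw the top-up patients only from Algorithm 2 patients not already sampled are assumptions.

```python
import random
from typing import List, Set, Tuple


def sequential_sample(alg1_ids: Set[str], alg2_ids: Set[str],
                      n: int = 100, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Draw n patients for Algorithm 1, then reuse those who also meet
    Algorithm 2 and top up with new Algorithm 2 patients until n is reached."""
    rng = random.Random(seed)
    # Step 1: simple random sample from all patients identified by Algorithm 1.
    alg1_sample = rng.sample(sorted(alg1_ids), n)
    # Step 2: carry over the sampled patients who also satisfy Algorithm 2.
    carried = [pid for pid in alg1_sample if pid in alg2_ids]
    # Step 3: top up from Algorithm 2 patients that were not sampled in step 1.
    remaining = sorted(alg2_ids - set(alg1_sample))
    top_up = rng.sample(remaining, n - len(carried))
    return alg1_sample, carried + top_up
```

In the study, step 2 carried over 88 patients and step 3 added 12.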

2.4 Patient Chart Review

Chart review was used as the gold standard. Two radiologists independently reviewed the diagnostic imaging reports for each patient to ascertain whether the patient had bone metastases.

The evaluators reviewed every report of imaging tests (i.e., CT, MRI, PET, and bone scintigraphy) performed in the same or preceding month of the first diagnosis record of bone metastases. Each evaluator determined the presence of bone metastases in the patient if the report contained a statement indicating its presence (i.e., true positive); otherwise, the patient was judged to have no bone metastases (i.e., false positive). If their judgments agreed, a determination was made. If there was a discrepancy between the two judgments, the evaluators discussed and made a final judgment. If necessary, information on medical charts outside the review period was also reviewed. If the evaluators did not reach an agreed conclusion, the case was considered “unable to judge” and excluded from the analysis.
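The adjudication rule can be written as a small decision function. The sketch below is illustrative only; the names `Verdict`, `adjudicate`, and `summarize` are hypothetical, and the `consensus` argument stands for the outcome of the evaluators' discussion (`None` when no agreement was reached).

```python
from enum import Enum
from typing import Iterable, Optional, Tuple


class Verdict(Enum):
    TRUE_POSITIVE = "true positive"
    FALSE_POSITIVE = "false positive"
    UNABLE_TO_JUDGE = "unable to judge"


def adjudicate(reviewer_a: bool, reviewer_b: bool,
               consensus: Optional[bool] = None) -> Verdict:
    """Combine two independent judgments on whether bone metastases are present."""
    if reviewer_a == reviewer_b:
        final = reviewer_a          # agreement: the shared judgment stands
    elif consensus is not None:
        final = consensus           # disagreement resolved by discussion
    else:
        return Verdict.UNABLE_TO_JUDGE  # unresolved: excluded from the analysis
    return Verdict.TRUE_POSITIVE if final else Verdict.FALSE_POSITIVE


def summarize(verdicts: Iterable[Verdict]) -> Tuple[int, int]:
    """Return (true positives, patients analyzed), dropping unjudgeable cases."""
    judged = [v for v in verdicts if v is not Verdict.UNABLE_TO_JUDGE]
    true_positives = sum(v is Verdict.TRUE_POSITIVE for v in judged)
    return true_positives, len(judged)
```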

2.5 Statistical Analyses

Summary statistics were calculated for age, sex, diagnosis names, and imaging tests performed to describe the background characteristics of the study population. The validity of the two claims-based algorithms was evaluated using positive predictive values (PPVs). For each algorithm, the PPV was calculated as the proportion of bone metastasis cases judged by chart review (i.e., true positive cases) among all the patients reviewed (i.e., patients identified by the algorithm as having bone metastases). The 95% confidence interval (CI) of the PPV, assuming a binomial distribution, was calculated using Wilson’s method [32].
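In symbols, let TP be the number of patients confirmed by chart review and n the number of patients reviewed for an algorithm. The PPV and a Wilson score interval can then be written as below. The standard form without continuity correction is shown; which variant the software applied is not stated here, so this form is an assumption.

```latex
\[
\mathrm{PPV} = \hat{p} = \frac{\mathrm{TP}}{n},
\qquad
\mathrm{CI}_{95\%} =
\frac{\hat{p} + \dfrac{z^{2}}{2n} \pm z\sqrt{\dfrac{\hat{p}\,(1-\hat{p})}{n} + \dfrac{z^{2}}{4n^{2}}}}
     {1 + \dfrac{z^{2}}{n}},
\qquad z \approx 1.96 .
\]
```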

All statistical analyses were performed using R version 4.1.0 (R Foundation for Statistical Computing, Vienna, Austria).

3 Results

3.1 Characteristics of the Study Population

Among 2641 patients extracted from the hospital administrative claims data with records including ICD-10 codes of C79.5, we identified 715 patients with bone metastases using Algorithm 1 (i.e., based on diagnosis records alone) and 620 patients using Algorithm 2 (i.e., based on diagnosis records plus imaging test records). The background characteristics of the patients are summarized in Table 1. We randomly selected 100 patients for chart review for each group (Figs. 1, 2).

Table 1 Background characteristics of patients identified by Algorithm 1 and Algorithm 2

Fig. 1 Flow diagram of patient selection (Algorithm 1: Presence of at least one diagnosis record of bone metastases during the study period)

Fig. 2 Flow diagram of patient selection (Algorithm 2: Presence of at least one diagnosis record of bone metastases, with a record of imaging test in the month or preceding month of the diagnosis of bone metastases)

3.2 Patients Identified by Algorithm 1

Of the 100 patients, 18 were excluded from the analysis due to a lack of diagnostic imaging reports. Thus, the remaining 82 patients formed the Algorithm 1 cohort. The mean ± standard deviation (SD) age was 64.07 ± 13.74 years, and 65.85% were men (Table 1). Of these, 74 (90.24%) patients underwent CT scans, and 49 (59.76%), 27 (32.93%), and 23 (28.05%) had MRI, bone scintigraphy, and PET, respectively, in the month or preceding month of diagnosis. Among the overall 715 patients, 620 patients (86.7%) had records of the imaging tests, and among the 100 sampled patients, 88 patients (88.0%) had records. The characteristics of these 82 patients were similar to those of the overall 715 patients (Table 1).

3.3 Patients Identified by Algorithm 2

Of the 100 patients, eight were excluded from the analysis because the true diagnosis could not be judged by chart review. Thus, the Algorithm 2 cohort consisted of 92 patients. The mean ± SD age was 63.75 ± 13.85 years, and 64.13% were men (Table 1). The proportions of patients who underwent CT, MRI, bone scintigraphy, and PET (92.39%, 60.87%, 31.52%, and 27.17%, respectively) were similar to those in the Algorithm 1 cohort. The background characteristics of this sampled population did not deviate notably from those of the overall patient population (Table 1).

No notable differences were observed between the Algorithm 1 and Algorithm 2 cohorts.

3.4 Validity of the Algorithms

Of the 82 patients identified by Algorithm 1, 69 had true bone metastases based on chart review. Thus, the PPV was 84.1% (95% CI 74.5–90.6) (Table 2). Similarly, of the 92 patients identified by Algorithm 2, 76 had bone metastases based on chart review, which resulted in a PPV of 82.6% (95% CI 73.4–89.1). Adding an imaging test condition to the diagnosis records therefore did not improve the PPV.
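As a worked check of these figures, the following Python sketch computes a PPV and a Wilson score interval from the counts. It is not the study's R code: the function names are hypothetical, and the interval is the standard form without continuity correction, which is an assumption.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (no continuity correction)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def ppv_with_ci(true_positives: int, reviewed: int) -> Tuple[float, float, float]:
    """PPV = chart-confirmed cases / algorithm-positive patients reviewed."""
    lower, upper = wilson_interval(true_positives, reviewed)
    return true_positives / reviewed, lower, upper


if __name__ == "__main__":
    for name, tp, n in [("Algorithm 1", 69, 82), ("Algorithm 2", 76, 92)]:
        ppv, lo, hi = ppv_with_ci(tp, n)
        print(f"{name}: PPV {ppv:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

With 69 of 82 and 76 of 92, this gives PPVs of 84.1% and 82.6%. The interval bounds fall within a few tenths of a percentage point of the reported values, and the last digit can differ depending on the interval variant and software used.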

Table 2 Positive predictive values of claims-based algorithms to identify patients with bone metastases

4 Discussion

Using validated algorithms to reduce misclassification is crucial to obtain high-quality, real-world evidence based on administrative claims data. To the best of our knowledge, this is the first study to assess the validity of algorithms to identify patients with bone metastases using claims data from a large hospital in Japan, using chart review as the gold standard. We evaluated two algorithms, one using diagnosis records alone and the other using diagnosis and imaging test records. Both demonstrated high PPVs for identifying bone metastases from claims data at this hospital.

Remarkably, even the most straightforward algorithm based solely on diagnosis codes had a high PPV of 84.1%. We expected that adding an imaging test would reduce the number of false-positive cases. However, the algorithm combining the diagnosis and imaging test records resulted in a PPV of 82.6%. Because bone metastasis is usually diagnosed based on symptoms and testing, including plain radiography, a diagnosis record of bone metastases may be highly reliable. This may have also been responsible for the high reliability of the diagnosis records in this study. At any rate, our results indicated that using diagnosis codes alone was sufficient to identify patients with bone metastases in this hospital’s claims data, and adding extra conditions was not worth the effort. This result was unexpected, given that stricter conditions can increase diagnostic accuracy for some diseases [33]. However, our results were welcome in terms of practicality and efficiency.

The two claims-based algorithms achieved high PPVs. Nevertheless, Algorithms 1 and 2 falsely included 13 and 16 patients, respectively, who did not have bone metastases according to the gold standard. One possible reason for this is that the disease name was recorded for examination purposes. A provisional or suspected diagnosis is recorded as an “uncertain” diagnosis. Nonetheless, such an “uncertain” flag may sometimes not be recorded when ordering examinations in the next month to avoid cancellation. It can also be omitted in busy clinical settings [31]. Furthermore, clinicians may have forgotten to correct disease names after a definitive diagnosis was made, leaving incorrect disease names in the claims records [33]. These are known causes of diagnostic inaccuracy in claims data in Japan and are hard to avoid altogether. However, the high PPVs of our algorithms indicate an acceptably low rate of including these patients; thus, we consider that the impact of false-positive cases using our algorithms would not be substantial.

Because PPVs depend on prevalence [34], and because the purposes of the algorithms developed may differ, there is no value in direct comparison of PPVs obtained in other validation studies conducted using different databases. Still, some previous studies have shown the potential of a simple algorithm based on diagnosis codes alone, similar to the present results. For example, Jensen et al. reported the utility of the single ICD-10 code (C79.5) to identify bone metastases using electronic medical registry data in Denmark, which had a high PPV (100% in prostate cancer and 86% in breast cancer) [27]. In the USA, the single ICD-9 code of 198.5 had a high PPV of 100% when used among inpatients with prostate cancer using Medicare data [30], although the same ICD-9 code had a slightly low PPV of 72.1% when used for breast cancer patients using claims data [28]. The combination of diagnosis and procedure records may be ideal in some settings. However, our results, in addition to these previous studies, suggest that they may not always be necessary.
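The dependence of PPV on prevalence follows from Bayes' theorem and can be shown with a short Python sketch. The sensitivity, specificity, and prevalence values used with it are purely hypothetical; none of them were estimated in this study.

```python
def ppv_from_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV implied by Bayes' theorem for a given sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence                # P(algorithm positive and diseased)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(algorithm positive and not diseased)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical algorithm: identical accuracy, evaluated in two source populations.
    for prevalence in (0.05, 0.50):
        ppv = ppv_from_accuracy(sensitivity=0.80, specificity=0.99, prevalence=prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

With sensitivity and specificity held fixed, the PPV rises as prevalence rises, so the same coding rule can show different PPVs in databases with different case mixes.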

This study focused on PPVs because a high PPV is important for identifying a group of people with a specific disease [34]. A high PPV is deemed important in comparative studies because it is considered that the non-differential sensitivity of disease misclassifications with high PPVs will not bias the risk ratio between groups [31]. We aimed to obtain algorithms to identify patients with bone metastases to examine treatment effects or the clinical/economic burden. Therefore, we prioritized maximizing the inclusion of patients who genuinely have this disease while minimizing the inclusion of those who do not—we did not aim to maximize the representativeness of the target population. We achieved our goal by obtaining algorithms with high PPVs. However, it should be noted that an algorithm with a high PPV has high specificity, potentially sacrificing its sensitivity [34], that is, it may miss some patients who do, indeed, have the disease. Thus, our algorithms may not be appropriate for estimating the incidence or prevalence of bone metastases. Researchers should use an algorithm whose high sensitivity is also confirmed for studies with such purposes.

One strength of this validation study was the use of chart review by radiologists as the gold standard, the results of which are considered the most reliable. For example, a previous study using cancer registry data as the gold standard to assess the validity of ICD-9 coding for bone metastases reported unsatisfactory results, and recommended the use of chart review as the gold standard [35]. Another strength was that our samples had high representativeness of the target population. All variables for patients’ background characteristics were similarly distributed between the overall and sampled populations.

In Japan, the linkage between databases is highly restricted [31]; currently available commercial claims databases are not linked to patients’ medical records. As the validity of claims-based coding cannot be evaluated using linked data, chart review is the most practical and reliable option in validation studies in Japan. However, chart review is time-consuming and may not always be feasible, even though such validation should always be conducted. Therefore, our results are expected to provide valuable information regarding the accuracy of diagnosis codes to identify bone metastasis patients as the first step; these results could also serve as a reference for similar studies in the future and be combined with new results to update the information. As a next step, studies in other facilities should focus on accumulating evidence or assessing parameters other than PPV (such as sensitivity or specificity) to promote the application of the algorithm in real-world practice.

This study has some limitations. First, the study was conducted using administrative claims data from a single hospital in Japan, which limits the generalizability of the results. Although guidelines are available for managing bone metastases in each malignancy, these are not strict, and several treatment options are available. Thus, treatment or diagnostic policies may differ between hospitals. The customs of administrative procedures may also vary at the individual and hospital levels. Therefore, our results may not apply to settings where the characteristics of hospitals or patients differ substantially from ours. Second, the prevalence of bone metastases might be higher in a large university hospital like Juntendo University Hospital than in other hospitals; such a high prevalence might have contributed to the high PPVs in our analyses. Third, as mentioned above, we did not evaluate the sensitivity of these algorithms. Our algorithms were intended to be used in comparative studies to examine treatment effects or clinical/economic burdens, but not in studies to estimate the incidence or prevalence of bone metastases, where further characterization of the algorithms (i.e., calculation of the sensitivity and specificity) is necessary. Fourth, for Algorithm 2, we sampled patients by a two-step procedure: 88 patients who met the definition of Algorithm 2 were selected from the 100 patients sampled for Algorithm 1, and 12 additional patients were randomly sampled from those identified by Algorithm 2, to obtain a total of 100 patients. Patients for Algorithm 2 could have been sampled in one step from the 620 patients identified by Algorithm 2, but we took a two-step procedure instead because, from the viewpoint of feasibility, we intended to keep the number of sampled patients to the minimum required for sufficient precision. However, this procedure is unlikely to have introduced additional bias because the populations sampled by the two procedures can both be considered random samples from the patients identified by Algorithm 2 and are therefore essentially equivalent. Despite these limitations, this was the first coding validation study to identify bone metastases using claims data in Japan, and we believe that our data will still be informative for researchers in this field.

5 Conclusion

In conclusion, we found that the two claims-based algorithms, one based on diagnosis records alone and the other based on diagnosis and imaging test records, had high PPVs (84.1% and 82.6%, respectively) in identifying patients with bone metastases. Adding imaging test conditions did not improve the PPV, indicating that identification based solely on diagnosis records would be sufficient for use in this hospital’s claims data. Although the generalizability of the present study is limited, our results provide, for the first time, evidence of the utility of our coding for identifying bone metastases from a hospital’s claims data in Japan. We hope that this validated algorithm will enhance the credibility of RWD studies on this critical clinical condition in Japan.