The diagnostic value of grey-scale inversion technique in chest radiography

Purpose We investigated whether the additional use of the grey-scale inversion technique improves the interpretation of eight chest abnormalities, in terms of diagnostic performance and interobserver variability.
Material and methods A total of 507 patients who underwent a chest computed tomography (CT) examination and a chest radiography (CXR) within 24 h were enrolled. CT was the standard of reference. Images were retrospectively reviewed for the presence of atelectasis, consolidation, interstitial abnormality, nodule, mass, pleural effusion, pneumothorax and rib fractures. Four CXR reading settings, involving three readers, were organized: only standard; only inverted; standard followed by inverted; and inverted followed by standard. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy, assessed with the area under the curve (AUC), and their 95% confidence intervals (95% CI) were calculated for each reader and setting. Interobserver agreement was tested by Cohen’s K test with quadratic weights (Kw) and its 95% CI.
Results CXR sensitivity % for any finding was 35.1 (95% CI: 33 to 37) for setting 1, 35.9 (95% CI: 33 to 37) for setting 2, 32.59 (95% CI: 30 to 34) for setting 3, and 35.56 (95% CI: 33 to 37) for setting 4; specificity % 93.78 (95% CI: 91 to 95), 93.92 (95% CI: 91 to 95), 94.43 (95% CI: 92 to 96), 93.86 (95% CI: 91 to 95); PPV % 56.22 (95% CI: 54.2 to 58.2), 56.49 (95% CI: 54.5 to 58.5), 57.15 (95% CI: 55 to 59), 56.75 (95% CI: 54 to 58); NPV % 85.66 (95% CI: 83 to 87), 85.74 (95% CI: 83 to 87), 85.29 (95% CI: 83 to 87), 85.73 (95% CI: 83 to 87); AUC values 0.64 (95% CI: 0.62 to 0.66), 0.65 (95% CI: 0.63 to 0.67), 0.64 (95% CI: 0.62 to 0.66), 0.65 (95% CI: 0.63 to 0.67); Kw values 0.42 (95% CI: 0.4 to 0.44), 0.40 (95% CI: 0.38 to 0.42), 0.42 (95% CI: 0.4 to 0.44), 0.41 (95% CI: 0.39 to 0.43) for settings 1, 2, 3 and 4, respectively.
Conclusions No significant advantages were observed in the use of the grey-scale inversion technique, either over the standard display mode or in combination with it, for the detection of eight chest abnormalities.
Supplementary Information The online version contains supplementary material available at 10.1007/s11547-022-01453-0.


Introduction
Chest radiography (CXR) is generally considered the entry-level imaging test for screening many pulmonary diseases, with good performance in the screening setting [1,2]. The interface between the air-containing bronchial tree and structures with no air gives the radiographic image a natural contrast, which radiologists exploit to depict abnormal findings [3]. These intrinsic anatomical features, along with continuous technical advancements in the field of digital radiography, have significantly contributed to making chest radiography one of the most requested radiological investigations [1,4].
Over the last decades, digital chest radiography has improved incrementally, with numerous processing tools being developed to support radiologists in the detection of pathological findings [2,4,5]. Most of these tools have been implemented to improve nodule detection, including digital tomosynthesis [6-8], dual energy and temporal subtraction techniques [9-11], computer-aided detection systems [12,13] and dark-field CXR. More recently, dark-field CXR has been demonstrated to be a valuable complementary tool for the assessment of pulmonary infiltrates, cardiomegaly and hemopericardium [14,15]. Such techniques are not yet widely available, and their use requires further validation. In comparison, the grey-scale inversion technique is universally available, being a built-in feature on most Picture Archiving and Communication System (PACS) display workstations. Based on the evidence that viewing the inverted image (black on white) improves human contrast perception [16], grey-scale inversion has been proposed as a valid supplementary tool to increase the diagnostic accuracy of radiographic imaging [17-21]. In chest radiography, the diagnostic value of inverted images has been investigated mostly for the detection of parenchymal nodules [17,22-26], pneumothorax [20] and rib fractures [27]. The clinical advantages of using this display method, however, are still debated, and no general consensus has been reached.
The purpose of this study is to investigate whether the additional use of grey-scale inversion technique improves the interpretation of the main chest abnormalities, in terms of both diagnostic performance and interobserver variability.

Ethics statement
This study was approved by the Institutional Review Board of the University Hospital of Parma (Prot. 51059). Given the retrospective nature of the study, informed consent was waived.

Study group
The study selection criteria were as follows: chest CT examination and CXR obtained within 24 h of each other, in patients older than 18 years of age admitted to the University Hospital of Parma between October 2017 and October 2019. CT and CXR images affected by motion artefacts or other technical limitations (e.g. chest structures only partially included within the CT acquisition volume or the CXR projection) were excluded. Chest CT served as the standard of reference (the CT technique is reported in the Supplementary material).

CXR imaging technique
Posteroanterior (PA) and left-lateral (LL) images were obtained with the patient standing and in full inspiration, using three digital radiography systems (Axiom Aristos FX, Siemens Healthineers; Essenta DR, Philips; and DigitalDiagnost, Philips). Acquisition parameters were as follows: 125 kV, 1.6 mAs, antiscatter grid, and a 180 cm focus-detector distance.
Anteroposterior (AP) images were acquired with the patient either lying down or sitting up, using two computed radiography systems (Practix 33 Plus, Philips and Practix 300, Philips). Acquisition parameters were as follows: 95-98 kV, 3.2 mAs, and a 120 cm focus-detector distance.
Images were visualized on a dedicated workstation (BARCO visualization system, Kortrijk, Belgium), and grey-scale inversion was performed with the built-in software of our PACS workstations (suite Estensa, Esaote, Genova, Italy) (Figs. 1 and 2).

Data collection and interpretation
CXR: CXR images were retrieved from the local PACS and independently reviewed by one general radiologist with 18 years of experience (Reader 1) and two third-year radiology residents (Readers 2 and 3), for the presence of eight predefined findings: atelectasis, consolidation, interstitial abnormality, nodule, mass, pleural effusion, pneumothorax and rib fractures. Chest abnormalities were classified based on the Fleischner Society glossary [28]. Standard grey-scale (also called "white bones") and inverted grey-scale ("black bones") CXRs were evaluated in two separate reading sessions, as follows:
• Session 1: standard setting first, followed by inverted grey-scale
• Session 2: inverted grey-scale first, followed by standard.
There was a washout interval of at least 4 weeks between the two reading sessions, and images were evaluated in random order. For each session, findings were recorded separately for the standard and the inverted grey-scale images, so that first-line reading with either display mode could be analysed on its own. Subsequently, the additional findings from the consecutive reading with the other display mode were recorded. This database allowed testing of CXR accuracy and interobserver agreement under different reading settings and combinations (see Statistical analysis). Reading time was recorded for each reader and session.
Standard of reference: The diagnostic performance of CXR with different visualization modes was tested against CT as the standard of reference. CT images were reviewed independently by two resident radiologists (Readers 4 and 5), who had access to the radiological reports, and classified as positive or negative, as follows:
• Positive CT was assigned in case of at least one of the eight above-mentioned findings;
• Negative CT was assigned when none of them was present.
Any discrepancy between Readers 4 and 5 was resolved by a chest radiologist with 13 years of experience. The same classification system was applied to discretize CXR outcome in binary categories.
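The binary classification rule described above can be sketched in a few lines of Python. This is only an illustration: the record layout (a dictionary mapping each finding to a boolean) and the name `is_positive` are assumptions made here, not the study's actual data format.

```python
# The eight predefined findings, as listed in the text.
FINDINGS = (
    "atelectasis", "consolidation", "interstitial abnormality", "nodule",
    "mass", "pleural effusion", "pneumothorax", "rib fractures",
)

def is_positive(reading):
    """Return True if at least one of the eight findings is marked present.

    `reading` is a hypothetical record: a dict mapping finding name -> bool.
    Findings missing from the dict are treated as absent (an assumption).
    """
    unknown = set(reading) - set(FINDINGS)
    if unknown:
        raise ValueError(f"unknown finding(s): {sorted(unknown)}")
    return any(reading.get(name, False) for name in FINDINGS)

# Hypothetical example: CT shows a nodule that the CXR reading missed.
ct_reading = {"nodule": True, "pleural effusion": False}
cxr_reading = {"nodule": False, "pleural effusion": False}
print(is_positive(ct_reading), is_positive(cxr_reading))
```

Applying the same function to both the CT and the CXR reading yields the paired binary outcomes from which a 2x2 table can be built.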

Statistical analysis
Continuous data were expressed as median and its 95% confidence interval (95% CI), whereas categorical data were expressed as absolute and relative distribution, with the corresponding 95% CI calculated using the Wilson method.
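For readers unfamiliar with the Wilson method, the following minimal sketch computes a Wilson score interval for a proportion. The function name and the use of a normal quantile of about 1.96 for a 95% interval are choices made for this illustration; the software actually used by the authors is not stated in the text.

```python
import math

def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z is the standard normal quantile (default: two-sided 95% interval).
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Example with counts taken from the Discussion: 28 pneumothorax cases
# among the 507 enrolled patients.
low, high = wilson_interval(28, 507)
print(f"{28 / 507:.3f} (95% CI: {low:.3f} to {high:.3f})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and behaves reasonably for rare findings, which is relevant when some abnormalities occur in only a handful of patients.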
The following reading settings were assembled for comparison with the CT standard of reference:
• Setting 1: standard reading only, derived from session 1
• Setting 2: inverted reading only, derived from session 2
• Setting 3: combined reading, first standard followed by inverted reading as per full session 1
• Setting 4: combined reading, first inverted followed by standard reading as per full session 2
Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated for each reader and all reading settings; accuracy was tested with the area under the curve (AUC) and its 95% confidence interval (95% CI). Interobserver agreement was tested by Cohen's K test with quadratic weights (Kw) and its 95% CI: Kw < 0.20 was considered to indicate poor agreement, 0.21 < Kw < 0.40 fair agreement, 0.41 < Kw < 0.60 moderate agreement, 0.61 < Kw < 0.80 good agreement, and 0.81 < Kw < 1.00 very good agreement.
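The performance measures listed above can be illustrated with a short Python sketch that derives them from a 2x2 table of CXR outcome against CT. The function name and the counts in the example are hypothetical. For a single binary test the ROC curve has one operating point, so the trapezoidal AUC equals (sensitivity + specificity) / 2; this is an assumption about how AUC could be obtained from binary outcomes, not a statement of the authors' software, although it is consistent with the setting 1 values in the abstract (sensitivity 35.1%, specificity 93.78%, AUC 0.64).

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic performance of a binary test against a reference standard.

    tp, fp, fn, tn: counts of true positives, false positives,
    false negatives and true negatives.
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column total must be non-zero")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Trapezoidal area under a one-point ROC curve.
        "auc": (sensitivity + specificity) / 2,
    }

# Hypothetical counts, chosen only to illustrate the calculation.
for name, value in diagnostic_metrics(tp=35, fp=6, fn=65, tn=94).items():
    print(f"{name}: {value:.3f}")
```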

Interobserver agreement
Kw values for any finding ranged from 0.23 to 0.63, 0.13 to 0.73, 0.21 to 0.66, and 0.14 to 0.75 for settings 1, 2, 3 and 4, respectively. Regardless of size, interobserver agreement at the detection of pneumothorax between the residents and the senior radiologist showed a slight improvement in both settings 3 and 4 (Table 3). Kw values were generally higher for large pneumothoraces (sized ≥ 3 cm [29]), with only two exceptions of greater values observed for small pneumothoraces (sized < 3 cm). Details are reported in Table 4.
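As an illustration of how a Kw value of this kind can be computed, the sketch below implements Cohen's kappa with quadratic weights for two readers and maps the result to the agreement categories given in the Statistical analysis. The function names are invented for this example, ratings are assumed to be coded as integers from 0 to k-1, and the handling of values falling exactly on a category boundary is an assumption.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with quadratic weights for two readers.

    ratings_a, ratings_b: equal-length sequences of integer categories
    in range(n_categories), one entry per case.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    k, n = n_categories, len(ratings_a)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = ((i - j) / (k - 1)) ** 2  # quadratic disagreement weight
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j] / n
    if weighted_exp == 0:
        raise ValueError("kappa is undefined when both readers use one category")
    return 1 - weighted_obs / weighted_exp

def agreement_label(kw):
    """Agreement category; boundary handling (<=) is an assumption."""
    for upper, label in ((0.20, "poor"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "good")):
        if kw <= upper:
            return label
    return "very good"

# Hypothetical binary ratings (0 = finding absent, 1 = present) for 8 cases.
reader_1 = [1, 0, 0, 1, 0, 1, 0, 0]
reader_2 = [1, 0, 1, 1, 0, 0, 0, 0]
kw = quadratic_weighted_kappa(reader_1, reader_2, n_categories=2)
print(f"Kw = {kw:.2f} ({agreement_label(kw)})")
```

With only two categories (finding present or absent), every disagreement receives the same weight, so the quadratic-weighted statistic coincides with the unweighted Cohen's kappa; the weights matter only for ordinal scales with three or more levels.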

Discussion
We observed that grey-scale inversion display mode did not significantly improve diagnostic performance or interobserver agreement compared with standard viewing mode.
Combinations of standard and inverted modes could help in reducing the interobserver variability across different levels of expertise.
CXR images are usually viewed in "white bones" mode on a video terminal; however, some readers prefer, to a variable extent, the "black bones" mode. The latter represents a subjective adaptation of the standard setting, based on the individual feeling that inverted images make abnormal findings easier to detect. We undertook this study to evaluate this perception systematically and showed that there is no actual diagnostic difference. Our results partially confirm previous observations by Park et al., who investigated the sensitivity and accuracy of the grey-scale inversion technique, limited to the detection of rib fractures. Park reported that the combination of the two reading modalities could improve chest radiography sensitivity and accuracy among residents and medical students, namely among readers with limited experience [27]. In our study, the combined use of the two approaches increased CXR sensitivity for one resident at the detection of consolidation when using reading setting 4 (i.e. first inverted, followed by standard), and at the detection of pneumothorax and rib fractures in settings 3 (i.e. first standard, followed by inverted) and 4. However, the improvement did not reach statistical significance for accuracy performance.
Interobserver agreement at the detection of pneumothorax between the residents and the senior radiologist showed a moderate improvement in both sessions and, as expected, was generally higher for large pneumothoraces in all settings and among all readers. Since the required reading time for both sessions was relatively short (not greater than 84 s), the combined use of the two display modes might be worth exploiting when pneumothorax is suspected. Having said that, pneumothorax was scarcely represented among the enrolled patients (5.5%, 28 cases).
The combined reading approach improved the PPV at the detection of pleural effusion by the senior radiologist, but showed a general drop in diagnostic performance as compared to the standard approach for the same reader. The unfamiliarity with the "black bones" images might have affected their interpretation by the senior radiologist. As pointed out by McMahon et al., when a new type of image results in lesser accuracy, the unfamiliarity with the new approach must be taken into account prior to blaming intrinsic properties of the new modality [11]. This "unfamiliarity effect" tends to have a minor impact on younger radiologists, who are inevitably less affected by a long-lasting habit.
Thompson et al. reported that two display modes can improve nodule detection [26]. These authors hypothesized that the advantage of using two display modes might lie in the fast-flicking between the two images, namely standard and inverted, which would draw attention to suspicious areas (e.g. lung periphery). This fast-flicking technique was not employed by our readers, for whom the detection was already slightly improved, suggesting that it might only partly explain the advantages of such a combination.
Even if limited in number, the majority of studies that have applied the grey-scale inversion display mode to chest radiography have attempted to demonstrate its additional value in detecting lung nodules, either real or simulated, with conflicting results [17,22-26]. Nodules were fairly represented in our sample (16.8%, 85 cases), and significant differences were not observed in accuracy or interobserver agreement with the combination of the two techniques. Their depiction rate was generally low among the three readers, ranging from 7.1% to 17.7%. One of the reasons for such low percentages can be found in their relatively small size (nodule median diameter of 7 mm, 95% CI: 6 to 8 mm), which has likely contributed to reducing their detectability by CXR. Previous studies reported better performance in nodule detection, notably with relatively larger solid nodules [17]. In contrast to previous analyses, a nodule size range was not set at the time of patient selection [22], since the general intent of this investigation was to reproduce a real clinical setting, without focusing on a pre-defined finding.
To our knowledge, this is the first study testing eight different abnormal findings at the same time and within such a large population. Indeed, the majority of studies that have investigated the application of the grey-scale inversion display mode to chest radiography tested only one selected finding at a time, enrolling no more than 300 subjects. Furthermore, we included bedside CXRs, with the aim of reproducing a real clinical setting, where a good proportion of patients are unable to stand (e.g. trauma patients or severely ill ones). Of note, the effect of reading setting was comparable for both standing and supine CXR imaging.
Our study, however, has several limitations. First, the retrospective design is prone to confounding factors, such as selection of patients. Second, CXRs were obtained with different technical equipment and parameters, which can ultimately affect the detectability of findings, nonetheless representing the actual routine of this imaging modality. Third, some of the findings included in the analysis were barely represented within the sample, such as mass (1.97%, 10 cases). Finally, the presence of only one senior radiologist limited the possibility of investigating the impact of different levels of expertise.
In conclusion, we observed no significant advantages in the use of the grey-scale inversion technique for the expert radiologist. The combination of the grey-scale inversion display mode with the standard mode could reduce the interobserver variability in readers with limited expertise.
Table 4 Interobserver agreement between the three Readers for small and large pneumothoraces
Authors' contributions MS, SN and NS were involved in the conception and design. NM, CS and CB contributed to the provision of study materials. NM, CS, CB and REL contributed to the collection and assembly of data. REL and MS were involved in the data analysis and interpretation. REL, MS, SN and GM contributed to the manuscript writing. All authors contributed to the final approval of manuscript.
Funding This research did not receive any specific grant from funding agencies in the public, commercial or not-for profit sectors.
Availability of data and material All data generated or analysed during this study are included in this published article and its supplementary information file.

Conflict of interest
The authors declare that they have no conflict of interest.

Consent for publication Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.