Introduction

Lung cancer is a global, chronic disease with a poor prognosis. The tumor–lymph node–metastasis (TNM) staging system is the most commonly used and most accurate prognostic model [1], and patients tend to achieve better outcomes when their treatment is matched to their stage. To determine the TNM stage accurately, patients must undergo a range of examinations, such as histopathological tests, CT scans, MRI scans, and/or PET-CT scans [2]. Patients must also meet certain physical criteria before they can undergo these examinations. The work-up can take anywhere from a week to a month, which is a burden for both patients and medical professionals.

The accuracy of TNM staging is estimated to be around 70% [3], which is insufficient to meet the demands of clinical practice; researchers are therefore trying to supplement it with easily available data. Routine blood tests are distinct from other clinical examinations because of their ease, speed, repeatability, and capacity to track changes over time [4, 5]. These attributes make them valuable in the diagnosis and prediction of numerous diseases, including COVID-19 [6]. It has been demonstrated that certain features obtained from routine blood tests, such as the Neutrophil-to-Lymphocyte Ratio (NLR), the Glasgow Prognostic Score (GPS), and the Systemic Immune-Inflammation Index (SII), can be used to predict cancer prognosis [7,8,9].

Researchers have discovered, and continue to discover, numerous complex blood features. To differentiate between the original features and the derived complex features, we refer to them as low-order features (LOFs) and high-order features (HOFs), respectively. The LOFs have an established naming system for their abbreviations, such as WBC (White Blood Cell Count), CRP (C-Reactive Protein), RBC_SD (RBC Distribution Width Standard Deviation), and MPV (Mean Platelet Volume). However, no such system exists for HOF abbreviations, which has caused confusion in how these abbreviations are used in existing reports. For instance, the calculation formulas of the Systemic inflammatory marker (SIM) [10] and the Systemic inflammation response index (SIRI) [7], of the lung immune prognostic index (LIPI) [11] and the dNLR combined with LDH index (LNI) [12], and of Onodera’s prognostic nutritional index (OPNI) [13] and the Prognostic nutrition index (PNI) [14] are identical, whereas the only distinction between the Glasgow prognostic score (GPS) [9] and the modified Glasgow prognostic score (mGPS) [14], and between the systemic inflammation score (SIS) and the modified SIS (mSIS) [15], is their cutoff values. Moreover, the Lymphocyte-to-Monocyte ratio (LMR) [7] and the Monocyte-to-Lymphocyte ratio (MLR) [16], and likewise the Albumin-to-Fibrinogen ratio (AFR) [17] and the Fibrinogen-to-Albumin ratio (FAR) [18], are reciprocals of each other, yet their importance for prognosis remains the same in data analysis. What is more, naming with single-letter abbreviations can lead to conflicts: GLR is used as an abbreviation for Gran/Lymph [19], GGT/Lymph [20] and Glc/Lymph [21] in different documents, while LLR is an acronym for both WBC/Lymph [14] and LDH/Lymph [23].
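The claim that reciprocal ratios carry the same information can be illustrated with a rank-based statistic. The sketch below is only an illustration under stated assumptions: the blood counts and survival times are made up, and the helper names `ranks` and `spearman` are hypothetical. It shows that LMR and MLR have a Spearman correlation of -1 with each other, and that their correlations with any third variable differ only in sign.

```python
def ranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up lymphocyte and monocyte counts (10^9/L) and survival times (days).
lymph = [1.2, 2.0, 0.8, 1.5, 2.4]
mono = [0.6, 0.4, 0.7, 0.5, 0.3]
days = [300, 500, 150, 800, 1000]

lmr = [l / m for l, m in zip(lymph, mono)]
mlr = [m / l for l, m in zip(lymph, mono)]

rho_between = spearman(lmr, mlr)   # reciprocal features reverse the ranking
rho_lmr = spearman(lmr, days)
rho_mlr = spearman(mlr, days)      # same magnitude as rho_lmr, opposite sign
```

Because the reciprocal only reverses the ordering of patients, reporting both LMR and MLR adds a second name without adding information to a rank-based analysis.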

The primary objective of this paper is to introduce the concept of blood high-order features (HOFs) and to survey the existing literature thoroughly in order to determine their potential for predicting the prognosis of non-small cell lung cancer (NSCLC).

Methods

Document retrieval

To identify as many blood HOFs as possible, a comprehensive PubMed search was conducted for articles published between January 2018 and October 2022. The search query was: (("Blood Cell Count"[MeSH Terms]) OR (Complete Blood Count[Title/Abstract]) OR ("Laboratory Tests"[Title/Abstract]) OR (blood routine[Title/Abstract])) AND (("Risk Factors"[MeSH Terms]) OR (Prognosis[MeSH Terms]) OR (Biomarkers[MeSH Terms])) AND ((complex index[Title/Abstract]) OR (ratio[Title/Abstract])) AND ((Cancer[MeSH Terms]) OR (Inflammation[Title/Abstract])) NOT ((review[Title/Abstract]) OR (Meta-analysis[Title/Abstract])).
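For readers who wish to repeat the search programmatically, the query can be passed to the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and sends nothing. How the search was actually run is not stated in this article, so the use of E-utilities, the helper name `build_search_url`, and the `retmax` value are all assumptions.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The search query reported above, written with plain ASCII quotes.
QUERY = (
    '(("Blood Cell Count"[MeSH Terms]) OR (Complete Blood Count[Title/Abstract]) '
    'OR ("Laboratory Tests"[Title/Abstract]) OR (blood routine[Title/Abstract])) '
    'AND (("Risk Factors"[MeSH Terms]) OR (Prognosis[MeSH Terms]) '
    'OR (Biomarkers[MeSH Terms])) '
    'AND ((complex index[Title/Abstract]) OR (ratio[Title/Abstract])) '
    'AND ((Cancer[MeSH Terms]) OR (Inflammation[Title/Abstract])) '
    'NOT ((review[Title/Abstract]) OR (Meta-analysis[Title/Abstract]))'
)


def build_search_url(term, mindate="2018/01", maxdate="2022/10", retmax=2000):
    """Build (but do not send) an ESearch request limited by publication date."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",   # filter on publication date
        "mindate": mindate,
        "maxdate": maxdate,
        "retmax": retmax,     # assumed page size, not taken from the article
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


url = build_search_url(QUERY)
```

The number of hits returned today may differ from the count reported in the Results, since PubMed indexing changes over time.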

Patients

This study included 1,423 patients who had been diagnosed with lung cancer and admitted to the Sichuan Cancer Hospital between 2015 and 2017. The treatment options for all patients were determined according to the Chinese Medical Association’s clinical diagnosis and treatment guidelines for lung cancer [24]. Patients were excluded if they had not been diagnosed with primary lung cancer, had other concurrent primary carcinomas, lacked blood test data prior to treatment, or had received anti-tumor therapy in other hospitals.

Data collection

This study was approved by the Medical Ethics Committee of the Sichuan Cancer Hospital (SCCHEC-02-2021-064). Clinical and laboratory data of the patients were obtained retrospectively; histological examination was used to verify the pathological type; and the American Joint Committee on Cancer (AJCC) Eighth Edition staging system [25] was used for tumor staging. The LOFs and HOFs that we used are listed in Additional Table 1, together with the normal reference intervals of the LOFs.

The final follow-up was conducted in May 2021. The endpoint was overall survival (OS), defined as the time from diagnosis to death from any cause or to loss to follow-up.
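The OS definition can be made concrete as a pair of values per patient: a duration and an event indicator. The following is a minimal sketch, assuming that patients who were lost to follow-up or still alive at the last contact are treated as censored (event = 0); that encoding and the function name `overall_survival` are assumptions, not taken from the study's code.

```python
from datetime import date


def overall_survival(diagnosis, last_contact, died):
    """Return (duration in days, event indicator) for one patient.

    last_contact is the date of death if died is True, otherwise the date
    of the last follow-up. The event indicator is 1 for death from any
    cause and 0 for a censored observation (assumed encoding).
    """
    days = (last_contact - diagnosis).days
    if days < 0:
        raise ValueError("last contact precedes diagnosis")
    return days, 1 if died else 0


# Made-up dates, for illustration only.
death_case = overall_survival(date(2016, 1, 1), date(2016, 1, 31), died=True)
censored_case = overall_survival(date(2016, 1, 1), date(2016, 3, 1), died=False)
```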

DeepSurv

DeepSurv [26] is a deep-learning survival algorithm that can capture both linear and nonlinear relationships in the data and predicts the risk of death for an individual patient. The algorithm was implemented in Python 3.7.6; for further information on the method and project code, please refer to references [3, 26].
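DeepSurv, as described in its original publication, is trained by minimizing the negative log partial likelihood of the Cox model, averaged over the observed events. The sketch below is a plain-Python illustration of that loss rather than the project code; the function name is hypothetical, the regularization term is omitted, and the data are made up.

```python
import math


def cox_neg_log_partial_likelihood(risks, times, events):
    """Average negative log Cox partial likelihood over observed events.

    risks  : predicted log-risk score per patient (the network output)
    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    """
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    total = 0.0
    for i in range(len(risks)):
        if events[i] != 1:
            continue
        # Risk set: everyone still under observation at the time of event i.
        risk_set = sum(
            math.exp(risks[j]) for j in range(len(risks)) if times[j] >= times[i]
        )
        total += risks[i] - math.log(risk_set)
    return -total / n_events


times = [1, 2, 3]
events = [1, 1, 1]
loss_uninformative = cox_neg_log_partial_likelihood([0.0, 0.0, 0.0], times, events)
loss_good = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
loss_bad = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
```

The loss is lowest when patients who die earlier receive higher risk scores, which is the ordering the C-index later measures.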

The input layer had the same dimensionality as the input data, the three hidden layers comprised 512, 1024, and 512 neurons, respectively, and the output layer had one neuron. The model was trained for 500 epochs with the Adam optimizer, an initial learning rate of 0.067, a learning rate decay of 0.06494, a dropout rate of 0.2, and an L2 regularization coefficient of 0.005. The reliability of the model was evaluated using five-fold cross-validation.
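The settings above can be collected in a single configuration, which also makes the size of the network explicit. This is a sketch only: the dictionary keys and helper names are hypothetical, and the parameter count assumes fully connected layers with bias terms.

```python
# Hyperparameters as reported in the text; the key names are illustrative.
CONFIG = {
    "hidden_layers": [512, 1024, 512],
    "output_units": 1,
    "epochs": 500,
    "optimizer": "Adam",
    "learning_rate": 0.067,
    "lr_decay": 0.06494,
    "dropout": 0.2,
    "l2_reg": 0.005,
    "cv_folds": 5,
}


def layer_shapes(n_features, config=CONFIG):
    """Return (inputs, outputs) for each dense layer, input to output."""
    sizes = [n_features] + config["hidden_layers"] + [config["output_units"]]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(n_features, config=CONFIG):
    """Count weights plus biases, assuming fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in layer_shapes(n_features, config))


# With a hypothetical input of 10 features the network has about one million
# trainable parameters.
n_params = count_parameters(10)
```

A network of this size is large relative to a cohort of 1,423 patients, which is one reason dropout, L2 regularization, and cross-validation matter here.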

To increase the interpretability of the DeepSurv model, the Shapley Additive exPlanations (SHAP) [27] approach was used to estimate the importance of each feature for the model. For each patient, the DeepSurv model generated a predicted risk value, and a SHAP value was assigned to each feature of that patient, indicating the influence of the feature on the model’s output risk value.
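The idea behind a SHAP value can be shown with an exact Shapley computation on a toy model. The sketch below does not use the SHAP library or the DeepSurv model: `toy_risk` is a made-up risk function with arbitrary coefficients, and the exhaustive enumeration of feature orderings is feasible only for a handful of features.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline.

    Each feature's value is its average marginal effect on the model output
    when features are switched from baseline to x in every possible order.
    """
    n = len(x)
    contributions = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]
            updated = model(current)
            contributions[i] += updated - previous
            previous = updated
    return [c / len(orders) for c in contributions]


def toy_risk(features):
    """Made-up nonlinear risk function of three blood features."""
    wbc, mpv, lymph_ratio = features
    return 0.3 * wbc - 0.2 * mpv - 0.1 * lymph_ratio + 0.05 * wbc * mpv


patient = [9.0, 10.0, 20.0]     # hypothetical feature values
reference = [6.0, 9.0, 30.0]    # hypothetical cohort-average values
phi = shapley_values(toy_risk, patient, reference)
```

The values in `phi` sum to the difference between the patient's predicted risk and the reference risk, which is the additivity property that gives SHAP its name.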

Statistical analysis and plotting

Statistical analysis was conducted using R version 4.0.2 (2020-06-22). Spearman’s method was applied to assess the correlation between features. The table of patient characteristics was generated with the “TableOne” package. The “ggDCA” package was used to create decision curve analysis (DCA) curves, and the “rms” package was used to generate the nomogram and calibration curves. The concordance index (C-index) was used to measure the agreement between predicted and observed outcomes.
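The C-index can be computed directly from its definition as the fraction of comparable patient pairs in which the patient who died earlier was assigned the higher risk. The sketch below is a simple pairwise version with made-up data; the function name is hypothetical, and tied risk scores are counted as half-concordant, which is a common convention that may differ from the implementation used in this study.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's
            # last observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [100, 200, 300, 400]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [0.9, 0.7, 0.2, 0.4])
c_reversed = concordance_index(times, events, [0.1, 0.3, 0.8, 0.6])
c_constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to uninformative predictions and 1.0 to a perfect ranking, which gives context for the values around 0.7 reported in the Results.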

Results

HOF categorization

The literature search identified 1558 articles, of which 1210 remained suitable for analysis after title and abstract screening. By reading these articles manually, we collected 160 HOFs and then merged them into 110 according to their calculation formulas. We found that HOFs can be classified into three groups according to calculation method: basic proportional type (e.g. NLR and LMR), composite type (e.g. derived NLR (dNLR) and PNI), and scoring type based on the first two types (e.g. GPS and LIPI). In this study, we identified 76 proportional (Table 1), 6 composite (Table 2), and 28 scoring (Table 3) HOFs.

Table 1 Proportional HOFs
Table 2 Composite HOFs
Table 3 Scoring HOFs
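One example of each HOF type can be written out as a short function. The formulas below follow the definitions commonly reported in the literature (dNLR as neutrophils divided by the non-neutrophil white cell count, PNI as albumin in g/L plus five times the lymphocyte count, and LIPI as a score of dNLR above 3 plus LDH above the upper limit of normal); they are given as an illustrative sketch, and the definitions in Tables 1, 2 and 3 take precedence. Cell counts are assumed to be in 10^9/L, and the patient values are made up.

```python
def nlr(neu, lymph):
    """Proportional type: Neutrophil-to-Lymphocyte Ratio."""
    return neu / lymph


def dnlr(neu, wbc):
    """Composite type: derived NLR, neutrophils over non-neutrophil WBC."""
    return neu / (wbc - neu)


def pni(alb_g_per_l, lymph):
    """Composite type: prognostic nutritional index."""
    return alb_g_per_l + 5 * lymph


def lipi(neu, wbc, ldh, ldh_upper_limit, dnlr_cutoff=3.0):
    """Scoring type: one point each for high dNLR and high LDH (score 0-2).

    ldh_upper_limit is the laboratory's upper limit of normal and must be
    supplied by the caller.
    """
    return int(dnlr(neu, wbc) > dnlr_cutoff) + int(ldh > ldh_upper_limit)


# Hypothetical patient values.
example_nlr = nlr(neu=6.4, lymph=1.0)
example_dnlr = dnlr(neu=6.4, wbc=8.0)
example_pni = pni(alb_g_per_l=40.0, lymph=1.0)
example_lipi = lipi(neu=6.4, wbc=8.0, ldh=300, ldh_upper_limit=250)
```

The scoring type depends on cutoff values as well as on the underlying ratio, which is why scores such as GPS and mGPS can share a formula and still differ.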

To avoid such naming conflicts in the future, we propose a set of rules for the naming of blood HOFs, with the aim of providing researchers with a consistent and accurate nomenclature. These rules include, but are not limited to:

  1. Preference should be given to the abbreviations reported in Tables 1, 2 and 3 of this article. The more frequently reported abbreviation in the left column is recommended over its reciprocal form in the right column.

  2. Use abbreviations that convey clinical relevance. Although the Lung Immune Prognostic Index (LIPI) and the dNLR combined with LDH index (LNI) are identical, LIPI is suggested because its name represents the feature more accurately.

  3. A product-type feature is denoted by the initial letters of its component features, whereas a proportion-type feature is denoted by the initial letters followed by the suffix ‘R’, indicating Ratio. For example, LA stands for the product of lymphocytes and albumin, while LAR is the ratio of lymphocytes to albumin.

  4. A proportional feature should be named in the order that yields a ratio greater than one. Multiplying by a coefficient to rescale the value should be avoided, as this does not alter the significance of the feature in data analysis.

  5. When a single-letter abbreviation clashes, the second letter or the full name of the conflicting feature should be used. For example, because GLR is ambiguous, GlcLR can be used to signify the Glucose-to-Lymphocyte ratio and GranLR the Granulocyte-to-Lymphocyte ratio.

  6. It is advisable to limit abbreviated names to between 3 and 6 characters to avoid confusion.
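Rules 3 and 5 are mechanical enough to be sketched in code. The helper functions below are hypothetical and cover only these two rules: they build initial-letter abbreviations and, when two ratios would share the same abbreviation, fall back to the full short name of the numerator (one of the two options that rule 5 allows).

```python
from collections import Counter


def product_name(first, second):
    """Rule 3: a product is named by the initial letters of its features."""
    return first[0].upper() + second[0].upper()


def short_ratio_name(numerator, denominator):
    """Rule 3: a ratio is named by the initial letters plus the suffix 'R'."""
    return numerator[0].upper() + denominator[0].upper() + "R"


def name_ratios(pairs):
    """Name a list of (numerator, denominator) pairs, resolving clashes.

    Rule 5: when several ratios would share an abbreviation, every one of
    them uses the full short name of its numerator instead.
    """
    short = [short_ratio_name(num, den) for num, den in pairs]
    clashing = {name for name, count in Counter(short).items() if count > 1}
    names = {}
    for (num, den), abbreviation in zip(pairs, short):
        if abbreviation in clashing:
            abbreviation = num + den[0].upper() + "R"
        names[(num, den)] = abbreviation
    return names


names = name_ratios([("Gran", "Lymph"), ("Glc", "Lymph"), ("Neu", "Lymph")])
```

In this example the two ratios that would both be called GLR become GranLR and GlcLR, while the unambiguous NLR keeps its short form.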

Patient characteristics

After compiling the HOFs, we collected patient data for validation. The cohort included 1423 individuals with NSCLC, of whom 945 had adenocarcinoma and 478 had squamous cell carcinoma. At diagnosis, the majority of patients were in the later stages, with 482 in stage III and 595 in stage IV. Approximately 36% (51/1423) of the patients were either current or former smokers. The number of men (945) was almost double that of women (478). The median age was 62 years (IQR: 52–67), and the median follow-up was 499 days (IQR: 189–1162.5). At the final follow-up, 675 (47.4%) patients had died. The baseline characteristics of the study cohort are outlined in Additional Table 2.

Correlation analysis

Having surveyed the reported HOFs, we next investigated whether each blood feature correlates with other clinical characteristics, namely sex, age, staging, smoking status, and pathological type. It should be noted that, as many patients in our cohort did not have a blood biochemical test prior to treatment, not all of the HOFs in Tables 1, 2 and 3 could be included in the analysis (Additional Table 1). For the correlation analysis of each characteristic, patients were screened on the basis of the other four parameters. The screening criteria and the features of the patients who met them are outlined in Table 4. The Spearman method was used for the correlation analysis, with a confidence level of 0.95. All groups, except for the smoking group, consisted of 60 patients, and the threshold for the absolute value of the correlation coefficient (Rs) was set at 0.305. The smoking group comprised 52 patients, with an Rs of 0.321. The results indicate that sex is associated with MCHC and WRPI among the LOFs. The calculation formulas of the two HOFs that are related to age already include age, so these correlations are not informative. Smoking can lead to an increase in neutrophils and a decrease in albumin, with a greater impact on HOFs. No correlation was observed between blood features and pathological type. Notably, no significant correlation was observed between any LOF and stage, but after high-order transformation, eight features were found to be related to stage. The most highly correlated feature was GGLR (GGT/Lymph), with a correlation coefficient of 0.4041. All the features with correlation coefficients greater than Rs are listed in the final row of Table 4.

Table 4 The screening criteria for correlation analysis and significant results display for each group

DeepSurv Analysis

To evaluate the importance of LOF and HOF data for the prognosis of lung cancer patients, models were constructed with the DeepSurv algorithm and their prediction accuracy was measured by the C-index. Table 5 shows that the LOF model is relatively stable, with no significant difference between its C-index values on the training set and the test set. The HOF model, by contrast, achieves a C-index of more than 0.7 on the training set, which is comparable to the effect of staging. However, possibly because many HOF features are highly correlated, it is prone to overfitting and performs poorly on the test set. Given that age, sex, and smoking status are readily available data that can be conveniently gathered during routine clinical assessments, we grouped these three variables together as ASS (Age + Sex + Smoking). The addition of the ASS features does not significantly enhance the performance of the prediction model.

Table 5 The C-index of LOF- and HOF-based DeepSurv models

To evaluate the impact of each feature in the model, we used the SHAP algorithm for visual analysis. In Fig. 1, the feature value reflects the actual value of each feature, and the SHAP value reflects that feature's contribution to the risk predicted by the model for an individual patient, with a negative value indicating a negative contribution. Figure 1A reveals that WBC, MPV, Mono_ratio, Baso_ratio, and Lymph_ratio are the five features with the most significant influence on the LOF model; an increase in WBC and Mono_ratio values is associated with a poor prognosis, whereas the other three have the opposite effect. Figure 1B indicates that, among the top five most important features of the HOF model, a rise in FIB4, GlcLR and Neu values is linked to a poor prognosis, whereas MPVLR and BLR have the opposite effect.

Fig. 1 The top 20 most important features in the LOF and HOF models, as chosen by the SHAP algorithm. A: LOF model. B: HOF model

Model comparison

The previous analysis leads us to believe that the risk value output by the DeepSurv model can be used to supplement the staging system, thereby improving the prediction efficiency. To compare the predictive performance of each feature combination more clearly, decision curve analysis (DCA) was used. The horizontal axis of the decision curve is the risk threshold. The horizontal “None” line represents the assumption that no patient is at risk, in which case the net benefit is zero. The “All” line represents the assumption that all patients are at risk, in which case the net benefit falls as the threshold increases, giving a downward-sloping line.
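The shape of the “None” and “All” reference lines follows from the standard definition of net benefit: the true-positive rate minus the false-positive rate weighted by the odds of the risk threshold. The sketch below uses a binary outcome at a fixed time point as a simplification; the curves in this study were produced with the “ggDCA” package on survival data, which is not reproduced here, and the function names and data are illustrative.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(risks)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the 'All' strategy: every patient is treated as at risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up predicted risks and observed outcomes (1 = event).
risks = [0.9, 0.8, 0.3, 0.1]
outcomes = [1, 1, 0, 0]

nb_model = net_benefit(risks, outcomes, threshold=0.5)
nb_all_low = net_benefit_treat_all(outcomes, threshold=0.2)
nb_all_high = net_benefit_treat_all(outcomes, threshold=0.5)
nb_none = 0.0  # the 'None' strategy treats nobody, so its net benefit is zero
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is how the curves in the figure should be read.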

As illustrated in Fig. 2, the risk prediction ability of the HOF model is superior to that of the LOF model, and the addition of ASS features can enhance the prediction efficiency of both models. However, the feature combination of DS_LOF + DS_HOF + ASS was not as effective as that of Stage + Pathotype in terms of prediction efficiency. Combining all features (Stage + Pathotype + DS_LOF + DS_HOF + ASS) provides the best prediction efficiency. It is evident that blood features can be employed as an additional factor in forecasting the risk of lung cancer patients.

Fig. 2 The DCA curves of different feature combinations. “DS_” denotes the output risk value of the DeepSurv model. ASS represents the combination of the Age + Sex + Smoking features. “All features” comprises the Stage + Pathotype + DS_LOF + DS_HOF + ASS features

Nomogram Model

Finally, a nomogram was established based on these features to obtain a more intuitive prognosis model. Figure 3A shows that stage is still the most significant prognostic factor, followed by DS_HOF, age, DS_LOF, sex, pathological type and smoking status. The C-index of the model is 0.744, and the calibration curve in Fig. 3B demonstrates its good predictive performance for lung cancer patients at 1 year and 3 years.

Fig. 3 The nomogram for OS and the calibration curve of the final prognosis model. A: Nomogram for 1-, 3-, and 5-year OS. B: Calibration curve of the nomogram predicting 1-, 3- and 5-year OS

Discussion

To sustain the exploration of HOFs with clinical application value and to deepen this line of research on blood test data, a sustainable, extensible system needs to be established. This is the first systematic review of blood HOFs; it aims to sort and classify the existing HOFs and to propose rules for their nomenclature.

Tables 1, 2 and 3 show that HOF mining has focused mainly on features of inflammation and nutrition: NLR, SII, GPS, SIS and other significant HOFs are all based on Neu, CRP, Lymph, Alb and Plt. However, for patients with early-stage cancer, nutritional and inflammatory status may not be a crucial indicator. It is therefore suggested to start from the viewpoint of the balance between pro- and anti-tumor factors. Tracking changes during the treatment process could help to identify such features quickly [85]. Previous research has demonstrated that changes in NLR throughout treatment have a more reliable prognostic value for patients than NLR at a single point in time [79, 119].

The correlation analysis reveals that low-order features have little correlation with clinical features, whereas many high-order features do correlate with them. This suggests that high-order features hold clinical significance in cancer diagnosis and treatment. In terms of medical applications, MLR can provide insights into the likelihood of prostate cancer [16], while NLR and PLR can be used to predict chemotherapy response [114] and the potential for metastasis [82]. Additionally, LWR and MWR have proven to be effective in forecasting the prognosis of gastric cancer [58].

Despite numerous reports of HOFs, the clinical significance of most of them remains uncertain and their interpretability is still unsatisfactory. This study shows that the output risk value of the model can be used in addition to the staging information to improve prognostic efficiency. Despite the integration of the SHAP algorithm, the limited interpretability of deep learning remains unresolved. We can only ascertain the influence of the chosen features on the model’s output, while the weight and calculation process of each feature remain unknown.

Conclusion

The main contribution of this paper is the sorting of reported blood HOFs, which can be used as an index for further research, together with a systematic evaluation of their ability to predict OS in NSCLC. However, there may still be many HOFs that have not been retrieved and included, and there is as yet no systematic scheme for the subsequent mining of blood HOFs; developing one will be the main goal of our research group.