Introduction

Head and neck cancer (HNC), the seventh most prevalent type of cancer worldwide and the ninth most common in the USA, refers to a variety of upper aerodigestive tract tumors [1]. Around 644,000 new HNC cases are projected to be diagnosed annually worldwide, with two thirds of these occurring in developing nations [2]. The American Joint Committee on Cancer defines HNC as tumors originating from the major and minor salivary glands as well as malignancies arising from the mucosal areas of the oral cavity, larynx, paranasal sinuses, and pharynx [3]. Important risk factors for HNC include smoking, drinking alcohol, overexposure to sunlight, gamma, and ultraviolet radiation, and a family history of cancer [4]. Additionally, human papillomavirus (HPV) has been linked to oropharyngeal cancer (OPC), which is a type of HNC. HPV-related cancer makes up approximately 25% of all HNCs [5]. The National Comprehensive Cancer Network (NCCN) guidelines recommend HPV testing for all oropharyngeal tumors [6]. In the USA, the percentage of head and neck cancers diagnosed as OPC that tested positive for HPV increased from 16.3% in the 1980s to more than 72.7% in the 2000s. This seems to be a result of increased awareness, the discovery of the link between HPV and cancers of the head and neck, and improved diagnostic HPV testing [7].

Importance of knowing the HPV status

Planning a course of treatment requires knowledge of the HPV status in OPC patients. HPV-positive OPC has a lower mortality rate than HPV-negative disease, with an 80–90% 5-year survival rate even with lymph node involvement and a 60% mortality rate with N3 or M1 disease. Both all-cause mortality (10.4% vs. 33.3%) and mortality primarily from head and neck cancer (4.8% vs. 16.2%) are lower for HPV-positive patients. By contrast, HPV-negative oropharyngeal squamous cell carcinoma (OPSCC) has a poorer prognosis, with a 5-year survival rate of about 67% [8, 9]. Therefore, knowing a patient’s HPV status aids in identifying those who may have a better prognosis and may not need an aggressive course of therapy. According to studies, HPV-positive tumors tend to respond better to radiation and chemotherapy. In contrast, HPV-negative cancers might be more resistant to conventional therapies, requiring more intensive or unconventional therapeutic modalities [10]. Additionally, because of the favorable prognosis of HPV-positive oropharyngeal cancer, there is significant interest in de-escalating treatment intensity for these patients in order to reduce treatment-related toxicities while preserving excellent outcomes [11]. HPV status information makes it easier to identify patients who may be candidates for treatment de-escalation, such as lowering radiation dosage or chemotherapy intensity. This strategy aims to achieve the best possible compromise between reducing adverse effects from the treatment and controlling the tumor effectively [12]. HPV status also influences eligibility for particular clinical trials and targeted therapies in oropharyngeal cancer. For individuals who are HPV-positive, several studies and cutting-edge treatments explicitly targeting HPV-related biological pathways may be beneficial.
By determining the HPV status, clinicians can find suitable clinical trial options and possibly investigate targeted treatments based on the underlying genetic features of the tumor [13]. The presence of HPV may also affect post-treatment surveillance plans. Modifying surveillance procedures based on HPV status enables a more targeted and individualized approach to post-treatment monitoring [14]. Ultimately, determining HPV status enhances the accuracy of therapy selection and helps to improve patient outcomes in oropharyngeal cancer.

Diagnostic and characterization methods

A variety of diagnostic and characterization techniques are currently available and in use for oropharyngeal cancer [15]. Rapid diagnosis and treatment increase a patient’s likelihood of recovery [16]. Oropharyngeal tumors are routinely diagnosed and characterized using traditional techniques such as physical examination, imaging studies, and tissue sampling [17]. During the physical examination, a medical practitioner carefully inspects the head, neck, and oropharynx for abnormal growths or other signs that might point to the presence of a tumor [18]. The oropharyngeal region can be visualized and tumors detected using imaging modalities such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and endoscopy [19]. Biopsy tissue sampling is essential for establishing the existence of a tumor and identifying its features. An oropharynx biopsy entails the removal of a small sample of tissue, which is then sent to a laboratory for examination. HPV molecular testing is possible on this tissue [20]. Conventional laboratory techniques such as p16 immunohistochemistry (IHC) and polymerase chain reaction (PCR) can be used to determine the HPV status of a patient. The 8th edition of the American Joint Committee on Cancer (AJCC) staging manual recommended p16 IHC as a diagnostic test for oropharyngeal cancer staging [21]. However, such testing may increase the workload and delay clinical application. It is important to remember that the diagnostic procedure can vary with the particular case, the preferences of the healthcare professional, and the availability of resources. To ensure a precise diagnosis and a thorough understanding of oropharyngeal tumors, a multidisciplinary strategy involving specialists such as otolaryngologists, radiologists, and pathologists is frequently used.

A new alternative: radiomics

In addition to the classic diagnostic techniques mentioned above, a new technique known as radiomics has recently attracted interest. The field of radiomics has advanced swiftly toward practical application in the hope of improving cancer treatment and accurate detection. Radiomics is a quantitative approach to medical imaging that uses sophisticated mathematical analysis to enhance the data already available to doctors [22]. Radiomics quantifies textural information by mathematically extracting the spatial distribution of signal intensities and pixel interrelationships using analysis methods from the field of artificial intelligence [23]. Due to their potential prognostic value for treatment outcomes, radiomic features have recently gained a lot of attention and may be useful in personalized medicine.

The application of radiomics in oropharyngeal malignancy diagnosis and precision medicine has shown promise since it enhances characterization and diagnosis. This improved characterization can help in making the distinction between benign and malignant tumors, determining how aggressive the tumor is, and guiding treatment choices [22]. Radiation oncologists can optimize radiation dose distribution and intensity modulation to target the tumor more precisely while preserving healthy tissues by incorporating radiomic characteristics into treatment planning algorithms. Radiation therapy when guided by radiomics attempts to provide a more individualized type of care that maximizes tumor control while minimizing adverse effects [24].

Radiomics can also help in calculating the likelihood of metastasis, disease recurrence, or overall survival, which can inform treatment choices and follow-up plans [25].

In some circumstances, radiomics may be used in addition to biopsy or, in very exceptional cases, as a substitute for it [26]. Radiomics can also be used to track a cancer patient’s response to treatment. Treatment success and tumor progression or regression may be determined by tracking changes in radiomic characteristics over time [22]. In summary, radiomics has the ability to capture complex tumor properties beyond what can be learned from a biopsy sample alone [26].

By identifying the most suspicious or important locations for sampling, radiology can help direct the biopsy procedure. By examining radiomic characteristics, imaging-based biomarkers, or tumor heterogeneity patterns, radiomics can increase the precision and diagnostic yield of biopsies, ensuring that the most representative tissue samples are obtained [27].

Biopsy and radiomics are two different methods, each with advantages and drawbacks. The non-invasive nature of radiomics is undoubtedly one of its advantages. Solid tumors are heterogeneous both spatially and temporally. As a result, the use of invasive biopsy-based molecular assays is constrained.

Radiomics, which may non-invasively detect intra-tumoral heterogeneity, now has a wide range of applications [28].

In summary, radiomics provides a non-invasive whole-lesion assessment with the possibility of temporal monitoring and multi-dimensional analysis. However, it lacks tissue confirmation and detailed histological data.

Contribution

In the literature, a number of algorithms have been proposed for radiomics-based HPV status determination [29,30,31,32,33,34,35,36,37,38,39,40,41]. This work is different in that the developed method is based on a not-yet-tried machine-learning algorithm combination. Additionally, a thorough comparison of our approach and results with the biopsy technique was conducted for the first time.

The advantages and disadvantages of each approach and typical error rates encountered are given. Finally, the future work needed to translate the non-invasive radiomics approach to routine clinical practice is outlined.

Material and methods

Data set

The data were drawn from a collection of 495 patients in The Cancer Imaging Archive (TCIA) [42]. Of these, only the contrast-enhanced CT scans of 238 patients with an OPC diagnosis were examined, 204 of whom were HPV positive. The gross primary tumor volume (GTVp) (see Fig. 1), which was segmented by experts, was used for the radiomic analysis [35].

Fig. 1. One slice of a CT image of a patient with OPC. The green-colored segmentation represents the tumor area

Image pre-processing

Prior to radiomic feature extraction, the CT images of all patients were resampled and normalized. Resampling produced voxels of 1 mm × 1 mm × 1 mm [43]. Resampling and interpolation were performed in 3D Slicer (version 5.0.3) using the Python-based pyradiomics package [44].
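The resampling step can be illustrated with a minimal sketch. It is not the 3D Slicer or pyradiomics implementation used in the study: it uses scipy.ndimage.zoom on a synthetic volume, and the voxel spacing, the function name resample_isotropic, and the choice of linear interpolation are all assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def resample_isotropic(volume, spacing_mm, new_spacing_mm=1.0, order=1):
    """Resample a 3D volume to isotropic voxels of size `new_spacing_mm`.

    `spacing_mm` is the original voxel size along each array axis.
    `order=1` selects linear interpolation (an assumption; the text
    does not state which interpolator was used).
    """
    zoom_factors = tuple(s / new_spacing_mm for s in spacing_mm)
    return ndimage.zoom(volume, zoom=zoom_factors, order=order)


# Hypothetical CT volume: 10 slices of 20 x 20 pixels,
# with 3.0 mm slice thickness and 0.5 mm in-plane pixel size.
rng = np.random.default_rng(0)
ct = rng.normal(loc=40.0, scale=10.0, size=(10, 20, 20))

ct_iso = resample_isotropic(ct, spacing_mm=(3.0, 0.5, 0.5))
print(ct_iso.shape)  # number of 1 mm voxels along each axis
```

After resampling, every voxel covers the same physical volume, so texture features computed on different patients refer to the same spatial scale.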

Feature extraction

After image preprocessing, feature extraction was carried out using 3D Slicer software (version 5.0.3) [45]. A fixed bin width of 10 was set for gray-level discretization in order to reduce variability [46]. The extracted features included those computed from the original images, wavelet-transformed images, and Laplacian of Gaussian (LoG)-filtered images (see Fig. 2). Once the images had been converted into features, the resulting data were used to train and evaluate machine learning (ML) models. These features can be used to perform quantitative image comparisons [47]. van Griethuysen et al. [48] give a detailed explanation of the radiomics technique.

Fig. 2. Feature extraction process
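To make the gray-level discretization step concrete, the following sketch applies a fixed bin width of 10 to a handful of intensities. It follows the fixed-bin-width formula given in the pyradiomics documentation, floor(x / w) - floor(min(x) / w) + 1; the function name and the sample intensities are hypothetical.

```python
import numpy as np


def discretize_fixed_bin_width(intensities, bin_width=10.0):
    """Map intensities to integer gray levels that start at 1.

    Implements floor(x / w) - floor(min(x) / w) + 1, so bin edges sit
    at multiples of the bin width regardless of the intensity range.
    """
    x = np.asarray(intensities, dtype=float)
    lowest_bin = np.floor(x.min() / bin_width)
    return (np.floor(x / bin_width) - lowest_bin + 1).astype(int)


# Hypothetical tumor voxel intensities (Hounsfield units).
roi = np.array([-5.0, 3.0, 12.0, 19.9, 20.0, 47.0])
levels = discretize_fixed_bin_width(roi, bin_width=10.0)
print(levels)  # [1 2 3 3 4 6]
```

With a fixed bin width, the number of gray levels depends on the intensity range of each tumor, whereas the physical meaning of one bin (here 10 intensity units) stays the same across patients.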

Data pre-processing and resampling

The 1142 extracted features were subjected to Z-score normalization. Eighty percent of the data was designated for training and the remaining 20% for testing. The HPV classes were unevenly distributed in both sets: the training set contained 190 cases (161 HPV positive, 29 HPV negative), while the test set contained 48 cases (43 HPV positive, 5 HPV negative).
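The normalization and 80/20 split can be sketched with scikit-learn as follows. The feature matrix is synthetic, the random seed is arbitrary, and fitting the scaler on the training portion only is an assumption; the text does not state on which data the normalization statistics were computed.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature table: 238 patients x 1142 features,
# with 204 HPV-positive (1) and 34 HPV-negative (0) labels as in the text.
rng = np.random.default_rng(42)
X = rng.normal(loc=5.0, scale=3.0, size=(238, 1142))
y = np.array([1] * 204 + [0] * 34)

# 80/20 split; the seed is an arbitrary choice for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: Z-score statistics are estimated on the training set only
# and then applied unchanged to the test set.
scaler = StandardScaler().fit(X_train)
X_train_z = scaler.transform(X_train)
X_test_z = scaler.transform(X_test)

print(X_train_z.shape, X_test_z.shape)  # (190, 1142) (48, 1142)
```

A 20% test fraction of 238 cases rounds up to 48, which reproduces the 190/48 partition sizes reported above.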

Because the HPV classes were imbalanced, the random over-sampling (ROSE) [49] resampling method was used. Resampling was applied only to the training set, yielding 161 samples in each of the HPV-positive and HPV-negative classes (see Fig. 3).

Fig. 3. Random over-sampling application on training data
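The effect of over-sampling on the training set can be illustrated with a short NumPy sketch. It performs plain random over-sampling with replacement, which is a simplification: it is not claimed to reproduce the ROSE implementation cited above, and the function name and synthetic data are assumptions.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Keep all samples and duplicate minority-class samples at random
    until every class has as many samples as the majority class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the shortfall with replacement (zero draws for the majority).
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(selected))
    return X[order], y[order]


# Synthetic training set with the class counts reported in the text.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(190, 5))
y_train = np.array([1] * 161 + [0] * 29)

X_bal, y_bal = random_oversample(X_train, y_train)
print(np.bincount(y_bal))  # [161 161]
```

Restricting the resampling to the training set keeps duplicated cases out of the test set, so the test performance is still measured on the original class distribution.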

Feature selection

Radiomics approaches typically produce high-dimensional data, which increases the risk of over-fitting, increases model complexity, and degrades prediction accuracy. In this research, correlation coefficient analysis (CCA), random forest (RF) feature importance analysis, and backward elimination were used to select informative features (see Fig. 4). CCA was first employed as a filter-based technique to remove redundant features that were highly correlated with one another (absolute correlation coefficient > 0.9). A random forest model was then fitted, and feature importance was measured with the Gini impurity metric [50].

Fig. 4. Flowchart for the feature selection process
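The first two stages of the selection flow can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the study's code: the helper drop_correlated and the greedy keep-first rule for correlated pairs are assumptions, since the text gives only the 0.9 threshold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def drop_correlated(X, threshold=0.9):
    """Return the column indices kept after a greedy correlation filter.

    A column is kept only if its absolute Pearson correlation with every
    previously kept column does not exceed `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


rng = np.random.default_rng(0)
base = rng.normal(size=(200, 4))
# Column 4 is a near-duplicate of column 0 and should be filtered out.
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
y = (base[:, 1] > 0).astype(int)  # only column 1 carries signal

kept = drop_correlated(X, threshold=0.9)

# feature_importances_ is the impurity-based (Gini) importance.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X[:, kept], y)
ranking = [kept[i] for i in np.argsort(forest.feature_importances_)[::-1]]
print(kept, ranking)
```

Filtering correlated features before ranking matters because impurity-based importance is shared among correlated features, which can push a redundant group down the ranking.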

The 50 most important features were selected using the sequential backward selection approach, with a k-nearest neighbor classifier serving as the estimator. The feature selection techniques were carried out in Python (version 3.9) using the MLxtend and Scikit-learn libraries [51].
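Sequential backward selection with a k-nearest neighbor estimator can be sketched as below. Note the substitution: the study reports using MLxtend, whereas this sketch uses scikit-learn's SequentialFeatureSelector to stay within one library. The data are synthetic, and the number of neighbors and of selected features are scaled-down assumptions.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
# In this synthetic example only features 0 and 3 carry signal.
y = (X[:, 0] + X[:, 3] > 0).astype(int)

# Start from all 8 features and remove one at a time, each time dropping
# the feature whose removal hurts cross-validated accuracy the least.
selector = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=5),
    n_features_to_select=3,  # the study kept 50 features
    direction="backward",
    cv=5,
)
selector.fit(X, y)

selected = np.flatnonzero(selector.get_support())
print(selected)
```

Backward selection is a wrapper method, so the selected subset is tuned to the estimator used during selection (here k-nearest neighbor) rather than to the final classifier.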

Results

Model training and evaluation

A random forest (RF) classifier was used as the prediction model with five-fold cross-validation (CV). Five hundred different hyperparameter combinations were evaluated on the training feature sets using the randomized search CV method to determine those that maximize model performance. This operation (five-fold nested cross-validation) was run five times for various training and testing sets. Model performance was then assessed separately on the held-out 20% of the original imbalanced data. On the test data, the RF algorithm predicted HPV status with an accuracy of 91% (95% CI 83–99%) and an area under the curve (AUC) of 0.77 (95% CI 0.65–0.89). The confusion matrix and the receiver operating characteristic (ROC) curve of the random forest model with ROSE resampling on the test data are shown in Figs. 5 and 6, respectively.

Fig. 5. Confusion matrix of random forest model with the ROSE re-sampling algorithm

Fig. 6. ROC curve of random forest model with the ROSE re-sampling algorithm
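The hyperparameter search and held-out evaluation can be sketched with scikit-learn's RandomizedSearchCV. Everything specific here is an assumption for illustration: the data are synthetic, the search space is hypothetical (the ranges explored in the study are not listed), AUC is assumed as the selection criterion, and the number of sampled combinations is reduced from 500 to 10 to keep the example fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
X_test = rng.normal(size=(50, 10))
y_test = (X_test[:, 0] + 0.5 * X_test[:, 1] > 0).astype(int)

# Hypothetical search space (not taken from the study).
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 3, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2"],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,          # the study reports 500 combinations
    cv=5,               # five-fold cross-validation on the training set
    scoring="roc_auc",  # assumed selection criterion
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model once on the held-out test set.
best_model = search.best_estimator_
accuracy = accuracy_score(y_test, best_model.predict(X_test))
auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])
print(search.best_params_, round(accuracy, 2), round(auc, 2))
```

Because the test set takes no part in the search, the accuracy and AUC computed at the end are not biased by the hyperparameter tuning.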

Discussion

The objective of this paper was to develop a new radiomics-based solution to the problem of HPV determination for OPC patients and make a thorough analysis of its applicability in routine clinical practice.

The RF algorithm in combination with resampling allowed us to identify the HPV status with an accuracy of 91% (95% CI 83–99%) and an AUC of 0.77 (95% CI 0.65–0.89) on the independent test data.
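As a plausibility check on the reported intervals, the sketch below computes a normal-approximation (Wald) interval for a test set of 48 cases. The reported intervals are numerically consistent with this calculation, but that this is how they were derived is an assumption; the text does not name the method.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation 95% interval: p +/- z * sqrt(p * (1 - p) / n)."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


n_test = 48  # number of cases in the held-out test set

acc_low, acc_high = wald_interval(0.91, n_test)
auc_low, auc_high = wald_interval(0.77, n_test)

print(f"accuracy: {acc_low:.2f} to {acc_high:.2f}")  # accuracy: 0.83 to 0.99
print(f"AUC:      {auc_low:.2f} to {auc_high:.2f}")  # AUC:      0.65 to 0.89
```

The width of both intervals is driven mainly by the small test set, which is one reason validation on larger cohorts would be informative.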

The RF algorithm has several known advantages, such as predictions that can be understood easily and higher accuracy than individual decision trees. It can also handle large datasets that may become available in the future. Other algorithms and different datasets have been investigated in several previous studies [29,30,31,32,33,34,35,36,37,38,39,40,41]. In the present study, the results were comparable to previous findings although the dataset was small and highly imbalanced. Another limitation is that no testing has been done to determine the sensitivity of the radiomic features to changes in segmentation. The results were obtained using an independent test dataset but have not been verified with data obtained from other institutions.

Comparison with biopsy and future work for widespread application

Radiomics and biopsy can be complementary to each other. Shortcomings of biopsy include missing the most aggressive or representative areas of the tumor [52] and failing to detect cancer cells in the sample because the tumor is small or located in a challenging anatomical site [53]. Furthermore, there can be inter-observer variability, where different pathologists interpret the same biopsy sample differently, leading to variations in treatment decisions [54]. Biopsy samples may not fully capture the complexity of the tumor, including variations in genetic mutations, protein expression, or cellular characteristics. In addition, biopsies, especially those performed using invasive techniques such as surgical excision or fine needle aspiration, carry some risks and potential complications. These can include bleeding, infection, damage to nearby structures, and patient discomfort [55]. In [54], sampling error accounted for 60.0% of inconsistent findings, and pathologist inconsistency accounted for 23.3%. The error rates can change depending on the specific biopsy procedure and the condition being evaluated.

It is important to acknowledge these limitations when interpreting biopsy results in oropharyngeal cancer. Clinicians may want to consider a complementary approach such as radiomics in order to obtain additional evidence, in particular when biopsy conditions are not optimal.

Radiomics also has its own disadvantages. There is a lack of standardized protocols and guidelines for feature extraction, leading to variability in the methods used across different studies and institutions. This lack of standardization can impact the reproducibility and comparability of radiomic results, making it challenging to establish consistent and reliable radiomic models. Radiomics relies heavily on the quality and consistency of the medical images used for analysis. However, imaging techniques, acquisition parameters, and equipment can vary between institutions, scanners, and even individual radiologists. These variations in image acquisition can introduce variability and bias in the radiomic features, affecting the accuracy and generalizability of the results. Moreover, radiomics relies on the availability of large and diverse datasets for training and validation purposes. However, obtaining high-quality imaging data with corresponding clinical annotations can be challenging due to issues such as data privacy, limited sample sizes, and variations in data collection across institutions. Limited data availability and potential biases in the data can impact the development and validation of robust radiomic models [56]. Furthermore, while radiomics studies have shown promising results in research settings, there is a need for robust validation and clinical translation. The performance of radiomic models in real-world clinical settings may differ from the initial research findings. Further validation studies, preferably in multi-institutional settings, are needed to establish the clinical usefulness and effectiveness of radiomics in various disease contexts. Radiomics results should also be accompanied by a probability that the prediction is correct for the particular patient.
Also, radiomic models often provide quantitative and statistical measures, but interpreting these measures and integrating them into clinical decision-making can be challenging [27]. The clinical relevance and meaningfulness of radiomic features need to be further explored and validated to ensure their utility in guiding treatment decisions and patient management. Addressing these reported issues requires ongoing research and collaboration among radiomics researchers, imaging experts, and clinicians. The Image Biomarker Standardization Initiative (IBSI), founded by the study participants in [57], was created to overcome these difficulties. Its goals are to provide nomenclature and definitions for frequently used radiomic features; a common radiomics image-processing scheme for computing image-based features; and data sets with associated reference values for calibrating and testing image processing software implementations.

Future ML work

Although our model is able to predict HPV status with a good AUC and accuracy, more research using larger clinical datasets is needed to verify the effectiveness of the developed ML model. All the above-mentioned concerns about radiomics (interpretability, standardization, reproducibility) should also be addressed.

Conclusion

In conclusion, this work demonstrates that it is clinically important and possible to develop a new CT radiomic-based non-invasive complementary solution for the determination of HPV status with accuracy rates that can challenge those obtained from biopsy. However, further research is needed to improve accuracy, safety, standardization, interpretability, and reproducibility for widespread clinical acceptance.