Developing global image feature analysis models to predict cancer risk and prognosis
- 120 Downloads
In order to develop precision or personalized medicine, identifying new quantitative imaging markers and building machine learning models to predict cancer risk and prognosis has been attracting broad research interest recently. Most of these research approaches use the similar concepts of the conventional computer-aided detection schemes of medical images, which include steps in detecting and segmenting suspicious regions or tumors, followed by training machine learning models based on the fusion of multiple image features computed from the segmented regions or tumors. However, due to the heterogeneity and boundary fuzziness of the suspicious regions or tumors, segmenting subtle regions is often difficult and unreliable. Additionally, ignoring global and/or background parenchymal tissue characteristics may also be a limitation of the conventional approaches. In our recent studies, we investigated the feasibility of developing new computer-aided schemes implemented with the machine learning models that are trained by global image features to predict cancer risk and prognosis. We trained and tested several models using images obtained from full-field digital mammography, magnetic resonance imaging, and computed tomography of breast, lung, and ovarian cancers. Study results showed that many of these new models yielded higher performance than other approaches used in current clinical practice. Furthermore, the computed global image features also contain complementary information from the features computed from the segmented regions or tumors in predicting cancer prognosis. Therefore, the global image features can be used alone to develop new case-based prediction models or can be added to current tumor-based models to increase their discriminatory power.
KeywordsMachine learning models of medical images Global medial image feature analysis Cancer risk prediction Cancer prognosis prediction Quantitative imaging markers
Area under a ROC curve
Breast imaging reporting and data system
Computer-aided detection or diagnosis
Correlation-based feature selection
Chronic obstructive pulmonary disease
Epithelial ovarian cancer
Full field digital mammography
Graphical user interface
Magnetic resonance imaging
Non-small-cell lung cancer
Pathologic complete response
Response evaluation criteria in solid tumors
Receiver operating characteristic
Subcutaneous fat area
Synthetic minority oversampling technique
Total psoas area
Visceral fat area
Medical imaging is commonly used in the clinical practice for cancer screening, early detection and diagnosis of tumors, prediction of cancer prognosis, and assessment of tumor response to treatment . However, due to the lack of quantitative assessment tools, subjective reading and interpreting medical images by radiologists are often difficult and generate large intra- and inter-reader variability . As a result, the efficacy of applying medical imaging in cancer screening and prognosis assessment is suboptimal and not robust. For example, although mammography is the most popular imaging technology used in breast cancer screening, its performance is unsatisfactory in terms of both cancer detection sensitivity and specificity . Studies have shown that sensitivity of screening mammography is lower among younger women (i.e., ≤ 50 years old) , those who have dense breasts , undergo hormone replacement therapy , and carry certain breast cancer susceptibility genes . For example, one study reported that mammography sensitivity decreased from 87.0% in women with almost entirely fatty breasts to 62.9% in women with extremely dense breasts or from 83.3% in women aged > 80 years to 68.6% in women aged < 50 years . Thus, a high percentage of mammography-occult breast cancer is missed or overlooked by radiologists in reading screening mammograms. Moreover, mammographic screening generates high recall rates, and the majority of the biopsies are benign , resulting in potential long-term psychosocial consequences in women participating in breast cancer screening . In predicting cancer prognosis or assessing tumor response to treatment, the guideline determined by the response evaluation criteria in solid tumors (RECIST)  used in current clinical practice often does not correlate well to the clinical outcome , which can generate overtreatment resulting in increased mortality and morbidity rates of the cancer patients due to unnecessary toxic side effects or aggressive surgeries .
To improve the accuracy and consistency in reading and interpreting medical images for cancer detection, diagnosis, and prognosis assessment, researchers have been actively working to develop and test computer-aided detection or diagnosis (CAD) schemes since the 1980s, which aim to serve as “the second reader” or provide radiologists new decision-making supporting tools . Most CAD schemes include three steps: (1) detect suspicious regions that may depict tumors, (2) segment the targeted regions, and (3) train a machine learning model that fuses multiple image features computed from the segmented regions . Despite great research enthusiasm and effort, false-positive detection rates of CAD schemes remain high , and whether using CAD can add values in clinical practice to help improve radiologists’ performance in reading and interpreting mammograms remains controversial . The technical challenges and limitations in developing CAD schemes may include but not limited to (1) difficulty in accurate segmentation of the targeted tumors from the images due to tissue overlap, connection, and fuzzy boundary, which reduce the accuracy and reproducibility of the computed image features to build robust machine learning models ; (2) high false-positive cues in the detection schemes, which can mislead radiologists and reduce their performance ; (3) use of small or biased training datasets, which causes overfitting and reduces robustness of CAD schemes when applied to new testing cases ; (4) higher correlation of the detection results between CAD and radiologists, which reduces the clinical utility of CAD as “the second reader” ; and (5) difficulty in developing multi-image-based CAD schemes  to fuse and compare variation of the image features in the longitudinal images  or different views of images . Thus, exploring new approaches in developing CAD schemes or machine learning models remains an unsolved but important research topic in the field of CAD-related medical imaging informatics.
Due to the difficulty in accurate tumor segmentation and identification of optimal handcrafted image features, great research effort has recently been made to apply deep learning models in developing CAD schemes . Although developing deep learning models can avoid tumor segmentation, it requires “big data” (availability of large training datasets). Thus, besides working to investigate how to optimally apply the deep learning method to develop robust CAD schemes using the small image datasets [26, 27, 28], we also investigate a different conventional machine learning approach that uses global image features computed from the entire imaged organs (i.e., breast, lung, and abdominal region) to train prediction models without suspicious region or tumor segmentation (as used in the conventional tumor-based schemes) or predefine the regions of interests with a fixed size (as used in many deep learning-based schemes). The new global image feature analysis-based models can be either implemented to build new case-based CAD schemes or fused with the existing tumor-based CAD schemes. The hypothesis of this new approach is based on the scientific premise and preliminary study results reported in the literature, which show that image features computed from background parenchymal or specific non-tumor regions also contain high discriminatory information to help predict cancer risk  and prognosis .
To test our hypothesis, we conducted several studies to develop and test a variety of new machine learning models using or adding global image features to predict cancer risk and cancer prognosis after surgery or chemotherapy. This study reviews several CAD schemes implemented with the global image feature analysis-based machine learning models developed in our recent studies. These models were built using different types of medical images including full-field digital mammography (FFDM), magnetic resonance imaging (MRI) and computed tomography (CT) images for breast, lung, and ovarian cancers. Moreover, to demonstrate the robustness of this new concept and the developed models, new experiments and data analysis results are also included in this study. Specifically, the basic concept or structure of the models is presented in Section 2, the new experiments and data analysis results are reported in Section 3, and the unique characteristics of this new approach and future research directions are discussed in Section 4 of this paper.
New quantitative imaging models
Prediction of short-term breast cancer risk
Breast cancer is the most prevalent cancer in women. Detection of invasive cancer at an early stage plays an important role in cancer treatment and reduction of patients’ mortality rates. However, due to the extremely low cancer detection yield (i.e., ≤ 0.3–0.5%) and higher false-positive recall rate (i.e., ≥ 10%), the efficacy of current population-based breast cancer screening using medical imaging (i.e., mammography with or without adjunction of ultrasound or MRI ) is quite low. To address and overcome this issue, developing a new risk-based breast cancer screening paradigm has been attracting research interest . The objective of the new approach is to develop a new model to predict short-term cancer risk, which enables to stratify the general screening population into two groups. Women assigned to the higher-risk group should be screened in a short interval (i.e., annual screening) after the negative screening of interest, while women assigned to the low-risk group can be screened in a longer interval until their short-term risk is significantly increased in the future reassessment. However, the current breast cancer risk factor or prediction models based on the epidemiology studies do not have discriminatory power to predict short-term risk in a woman developing breast cancer after a negative screening of interest and help a woman or her physician decide the optimal screening interval and/or method .
In our studies, we developed and tested a new risk prediction model based on the following scientific evidence or experimental observations: (1) humans naturally show bilateral symmetry in paired morphological traits, including the breasts; (2) bilateral asymmetry of breast tissue patterns is an important imaging phenotype or marker associated to biological processes; (3) bilateral mammographic tissue asymmetry and its change over time are commonly assessed by the radiologists in their decision-making process of cancer detection; and (4) CAD schemes can yield more consistent results in quantifying mammographic density and their bilateral asymmetry by avoiding inter-reader variability . Thus, we build a new model to predict short-term breast cancer risk based on the computation and analysis of bilateral mammographic density and tissue asymmetry of the negative images between the left and right breast. The goal is to predict the likelihood of a woman developing detectable cancer in the short term (i.e., the next subsequent annual mammographic screening).
Prediction of lung cancer recurrence risk
Recently, promoting and implementing lung cancer screening programs using low-dose CT imaging and other incident findings in detecting or diagnosing chromatic lung diseases have resulted in detecting more early-stage lung cancers. Most of the cases are non-small-cell lung cancers (NSCLCs). Although early cancer detection and surgical treatment help reduce the mortality rate of patients diagnosed with early-stage NSCLC, lung cancer recurrence rates after surgery of tumor resection is still high (i.e., ranging from 30% to 60% as reported in the literature ). To more accurately stratify patients into two groups of having a high and low risk of cancer recurrence, researchers have investigated many genomic biomarkers to predict the risk of cancer recurrence in patients with early-stage NSCLC . Patients with high cancer recurrence risk need to be continuously treated after surgery using other methods (i.e., radiation therapy or chemotherapy) to reduce the risk of cancer recurrence and increase the cancer-free survival.
Prediction of breast cancer response to neoadjuvant chemotherapy
Currently, neoadjuvant chemotherapy has been increasingly used as first-line therapy in patients diagnosed with locally advanced breast cancer. Previous studies have demonstrated long-term prognosis among the high proportion of patients who have good post-treatment imaging responses, such as those who have a pathologic complete response (pCR) at the time of surgery. However, due to the lack of accurate prognostic markers, a higher percentage of patients with pCR still undergo unnecessary and aggressive surgery. Additionally, other patients who do not respond to chemotherapy suffer from unnecessary toxic side effects. Such overtreatment increases patients’ morbidity and mortality rates . As a result, many research groups have attempted to develop CAD schemes of breast MRI in assessing tumor response to chemotherapies by comparing the changes of the contrast-enhanced kinetic image features computed from the tumors segmented from the MR images obtained before and after chemotherapy.
Prediction of chemotherapy efficacy in patients with ovarian cancer
Ovarian cancer has the highest mortality rate in gynecologic malignancy. Most ovarian cancers detected in the clinical practice (> 85%) are epithelial ovarian cancers (EOCs) and they are typically diagnosed in the advanced stage with metastatic tumors spreading to other organs of the body. In these patients, angiogenesis plays a fundamental role in the pathogenesis of EOC, which results in higher vascular endothelial growth factor expression and promotes tumor growth, ascites, and metastases. Thus, new chemotherapies (i.e., bevacizumab) that target the angiogenesis-specific pathways were developed and tested in many clinical trials. However, studies have shown that some patients received benefits with the increased progression-free survival (PFS) or overall survival (OS), while others did not receive benefits due to the high toxicity and other serious side-effects . How to effectively identify patients with EOC who are most likely to benefit from receiving bevacizumab or other antiangiogenic therapies remains an unsolved clinical issue in the treatment of patients with EOC. Thus, identifying effective imaging markers and/or developing prediction models can help address or solve this clinical issue.
Steps in training machine learning models using small datasets
Although we can compute a large pool of image features (i.e., morphological, density heterogeneity, and texture) from the original medical images or transformed maps (i.e., frequency domain), identifying small sets of optimal image features from the initially large feature pools is an important and challenging task to improve the performance and robustness of the multiple image feature fusion-based machine learning models. Additionally, we also often encounter two difficult issues related to image datasets. The first is the relatively small number of cases, and the second is the unbalance between two case classes (i.e., more negative cases than positive cases). Thus, to minimize or reduce case selection bias in searching for optimal features and training machine learning models in our studies, we perform the following model training and testing steps.
First, a synthetic minority oversampling technique (SMOTE)  is applied to balance the number of cases in two classes to achieve the ratio close to 1:1 (if needed). The details of applying the SMOTE algorithm in our model development have been reported in several studies (i.e., ref. ). Second, a feature selection algorithm (i.e., modified sequential forward floating selection ) or feature regeneration algorithm (i.e., locality preserving projection ) is applied to search for and build an optimal feature set or vector. Third, a machine learning model (i.e., artificial neural network and support vector machine) is trained and tested. Because of the availability of the relatively small image dataset, we typically apply a cross-validation method, such as a 10-fold or leave-one-case-out (LOCO) cross-validation method, to train and test the model. When using a cross-validation method, the first and second steps of SMOTE and feature selection are embedded into the cross-validation loop to train and test the model. Thus, the testing cases will be excluded from the above two processes of generating synthetic data and feature selection or regeneration. Each testing case (including the synthetic case) is tested once and receive one prediction score (ranging from 0 to 1) generated by the trained machine learning model. Lastly, the synthetic data or cases are removed, and then the area under a ROC curve (AUC) or adjusted odds ratio (OR) is used as evaluation indices to assess model performance. Furthermore, by applying an operation threshold on the model-generated risk scores, we build a confusion matrix to compute the overall prediction accuracy (i.e., sensitivity, specificity, and positive predictive value).
Experiments and results
Response of breast tumors to neoadjuvant chemotherapies using the global kinetic image features computed from the breast MRI performed before chemotherapy ;
In this study, we report two new sets of experiments and data analysis results, which have not been previously published in the peer-reviewed journal papers.
Prediction of short-term breast cancer risk
A retrospectively assembled image dataset, which involves images acquired from 1045 women who underwent at least two annual mammographic screening, was used in one recent study. Specifically, each case had two subsequent screenings defined as “current” and “prior” screenings with a time interval ranging from 12 to 18 months. All “prior” images were detected as negative by the radiologists in the original mammography screening. In “current” screenings, cancers were detected and confirmed in 402 cases, while the remaining 643 remain negative. Each screening mammography has 4 images in the CC and MLO view of the left and right breasts. Thus, all “prior” negative mammograms were selected and processed by the model to predict the risk or likelihood of developing cancer that is detectable in the next (“current”) mammographic screening.
The CAD scheme computes six image feature differences between the left and right mammograms. Briefly, from one bilateral pair of either CC or MLO view images and the processed or transformed image maps, the scheme computes bilateral difference of (1) the average mammographic density, (2) fibro-glandular tissue volume, (3) size of the CAD-detected focal asymmetric regions, (4) average pixel values computed from the two local breast tissue fluctuation maps, (5) average pixel values computed from the maps generated using a difference-of-Gaussian filter, and (6) overall mammographic density (similar to the breast imaging reporting and data system [BIRADS]).
Comparison of the adjusted odds ratios (OR) and 95% CIs between the new model-generated risk scores and mammographic density ratings by the radiologists
Number of cases
New prediction model
Prediction of lung cancer recurrence risk
A retrospective dataset involving 107 patients diagnosed with early-stage NSCLC was used in this experiment. Postoperatively, 26 patients had a cancer recurrence, while 81 had disease-free survival (DFS) in 3 years. A CAD scheme was applied in processing CT images of these patients, which were acquired preoperatively, and compute to a pool of 56 image features, which includes 35 tumor-related morphological, CT number distribution, and texture features computed from the segmented three-dimensional tumor volume as reported in our previous study that built a prediction model using tumor-related image features only  and 21 emphysema-related features computed from the entire lung volume of the CT images. These 21 features are divided into three subgroups representing emphysema volume and shape, density distribution and heterogeneity, and gray-level texture-based features.
First, due to unbalanced data (26 positive cases for cancer recurrence and 81 negative cases for DFS in 3 years), a SMOTE algorithm was applied to add synthetic data and double the “positive” test cases from 26 to 52 to improve case balance in the two classes. Thus, a total of 133 cases were used to build and optimize random forest tree models. Second, a correlation-based feature selection (CFS) algorithm implemented in Weka data mining software package with a best-first heuristic feature selection criterion  was applied to select a subset of optimal features from the initial feature pool. This feature selection method evaluates the value of a subset of features with respect to the discriminative power of each individual feature along with the degree of redundancy between the features. Using the LOCO cross-validation method, we sorted the frequency of the selected top performed features and finally assembled a small and optimal set of eight image features, which include five tumor-related features and three emphysema-related features. Lastly, we built and compared three random forest models using five tumor-related features, three emphysema-related features, and all eight features. The LOCO cross-validation method was used to train the model and validate its performance.
The selected three top-performing emphysema-related features are entropy, autocorrelation, and uniformity of global emphysema patterns. AUC values using these three individual features to predict the risk of lung cancer recurrence are 0.63 ± 0.07, 0.57 ± 0.06 and 0.66 ± 0.06, respectively. It indicates that, unlike the subjective reading of radiologists, which estimates a percentage of emphysema regions over the entire lung volume, the CAD scheme can detect and quantify more image features that have higher or improved discriminatory power than the percentage of emphysema in the total lung volume. By comparing these three random forest tree models trained using tumor-related features, emphysema-related features, and combined features, AUC values are 0.79 ± 0.05, 0.70 ± 0.07, and 0.84 ± 0.03, respectively, which indicates that adding global emphysema-related features significantly increases prediction performance (p < 0.05).
Comparison of three confusion matrices generated by three machine learning models trained using tumor-related features (Model 1), emphysema-related features (Model 2), and combined image features (Model 3) to predict the risk of lung cancer recurrence
Prediction – yes
Prediction – no
Comparison of prediction performance of three machine learning models trained using tumor-related features (Model 1), emphysema-related features (Model 2), and combined image features (Model 3)
Positive predictive value
Negative predictive value
To help establish new precision or personalized medicine, identifying new genomic biomarkers and quantitative imaging markers and developing the multiple feature fusion-based machine learning models have been attracting broad research interest in the biomedical research field recently. Based on the newly proposed radiomics concept and several preliminary studies, it is feasible to identify and compute quantitative medical imaging markers to assist the prediction of cancer risk and prognosis. Since medical imaging is commonly used in clinical practice, developing new prediction models based on medical image features provides an unprecedented opportunity to support radiologists, oncologists and/or surgeons in their decision making of cancer diagnosis and treatment at low cost . Recently, we have been exploring and developing several new CAD-supported machine learning models. These models can be applied to radiographic images of human bodies (i.e., presented in this paper) or animal models  and to digital histopathology images . In this study, we reviewed several prediction models trained using global or non-tumor-related image features computed from a variety of medical images (FFDM, MRI, and CT) for breast, lung, and ovarian cancers with new experimental results. From our recent studies, we observed several unique characteristics of developing new image processing and machine learning models involving global or non-tumor-related image features.
First, although many cancer risk factors have been identified and used in existing cancer risk prediction models based on epidemiology studies, these models lack the clinically accepted discriminatory power to help establish new risk-based or personalized cancer screening programs . For example, breast MRI has the highest cancer detection sensitivity  and has been recommended by the American Cancer Society as an adjunct screening tool to mammography in women with increased cancer risk (i.e., > 20–25% of lifetime risk). However, it excludes most women who have mammography-occult early breast cancers. In contrast, annual MRI screening in small groups of women at the elevated risk has a quite low cancer detection yield (i.e., 2–3%) . However, quantitative imaging markers are time-dependent, which makes them significantly different from most genetic and lifestyle-based risk factors used in existing cancer risk models. Thus, the risk scores generated by imaging markers or machine learning models will increase as the time interval to having or developing image detectable cancer shortens . As a result, cancer risk prediction models based on quantitative image features have advantages in predicting short-term cancer risk after having negative screening of interest, which can help stratify the general cancer screening population into different groups with variable screening intervals to improve efficacy of cancer screening (i.e., increase cancer detection yield and reduce unnecessary biopsies of benign cases). Since medical image features vary as cancer risk increases or decreases, it has the potential to establish different screening intervals or strategies for individuals at different life periods. In our studies, we observed that, although tumors are not detectable in the negative screening images, the global image features computed from these images carry useful information or markers to predict cancer risk. For example, we demonstrated the advantages of developing short-term breast cancer risk prediction models based on bilateral mammographic density and tissue pattern asymmetry. A similar concept may also be applied in screening other types of cancer. For example, based on the current guideline issued by the US Preventive Services Task Force, annual screening for lung cancer using low-dose CT images only applies to adults aged 55–80 years who have a 30 pack-year smoking history and currently smoke or have quit smoke within the past 15 years. This guideline also has similar weaknesses of low cancer detection yield among the targeted smoking groups and omitting the majority of nonsmokers who can also develop lung cancer. To address this challenge, researchers can also develop new short-term lung cancer risk prediction models using the quantitative image features computed from the negative lung CT images, such as using the features related to the heterogeneity of COPD patterns .
Second, the most current CAD schemes are tumor-based schemes aiming to detect suspicious tumors, classify malignant and benign tumors, and predict or assess tumor response to chemotherapies. The challenges of using these tumor-based CAD schemes include (1) high false-positive rates, which may impose a negative impact on radiologists and reduce their image reading performance [17, 19], and (2) difficulty and error in tumor segmentation, which reduces the accuracy and robustness of the computed image features . The case-based CAD schemes only use global image features without detecting tumor locations and segmenting tumor regions. This makes developing global image feature analysis model-based CAD schemes simpler and probably more robust. However, the new case-based CAD schemes do not directly compete with tumor-based CAD schemes. For example, although case-based CAD schemes may not be used as “a second reader” as the current tumor-based CAD schemes, they have the potential to be used as prescreening tools to help stratify image cases into high- and low-risk groups (e.g., like prescreening performed by technologists ). Using the model-generated prediction scores (or “warning” signs), radiologists can focus on reading and interpreting higher risk cases to increase detection sensitivity by reducing the risk of missing or overlooking subtle tumors, while reducing image reading time in lower risk cases. Thus, adding this prescreening process may help improve both the accuracy and efficiency of radiologists in reading and interpreting medical images in the busy or high-volume clinical practice.
Third, our studies also demonstrated that the models developed using global image features can not only generate higher or equivalent discriminatory power compared to the conventional tumor-based models but also provide complementary information due to the lower correlation between the image features and prediction scores generated by these two types of models. Thus, an optimal fusion of quantitative image features computed from both the tumor and global parenchymal regions can further improve model performance in detecting suspicious breast tumors  and predicting the risk of lung cancer recurrence (Tables 2 and 3). Such a fusion approach can also be expanded to optimally combine imaging markers and genomic biomarkers to improve model performance in cancer risk prediction, tumor diagnosis, and prognosis assessment [37, 56].
Fourth, the efficacy of cancer treatment (i.e., using chemotherapies) depends on not only the characteristics of tumors but also overall health issues of patients. Thus, it is also important to identify new imaging features or markers computed from other non-tumor regions. For example, we demonstrated that using a logistic regression model built by quantitative image features of adiposity can yield significantly higher accuracy than using the bone mass index to predict the benefit in patients with ovarian cancer who received bevacizumab-based chemotherapy . Our machine learning model combining the computed image features associated with SFA, VFA, and TPA has been applied to analyze image data acquired from a large US national clinical trial (GOG 218), which involves > 1800 patients diagnosed with ovarian cancer. The study result supports the feasibility of using these non-tumor-related imaging markers as new prognostic prediction markers .
Therefore, this study reviewed several machine learning models based on the analysis of image features computed from global or non-tumor regions and presented new experimental data. This is a new research direction in CAD-related medical imaging informatics field, which opens an opportunity for researchers to explore new research and application tasks. For example, we recently investigated the feasibility of developing a new CAD scheme based on global mammographic image features to classify malignant and benign cases in which suspicious tumors have been detected by radiologists. The scheme initially computes 59 global mammographic image features, followed by applying a particle swarm optimization algorithm to search for optimal features and training a support vector machine model to predict the likelihood of malignancy. When using a relatively small dataset involving 134 malignant and 141 benign cases, the model yields a performance of AUC = 0.79 ± 0.07 , which is highly comparable to the performance of applying tumor-based CAD schemes in classifying malignant and benign tumors .
We would like to acknowledge the support received from the Peggy and Charles Stephenson Cancer Center, University of Oklahoma, USA, and thank the contribution of several former members worked in our research laboratory including Drs. Maxine Tan (a postdoc research associate) and Shiju Yan (a visiting scholar), the graduate students of Nastaran Emaminejad (MS) and Yunzhi Wang (PhD).
BZ proposed the study hypothesis and supervised studies (experiments and data analysis). All other co-authors led the effort to design study protocols in the corresponding studies and/or made contribution to conduct experiments and data analysis in different studies discussed in this paper, such as YQ in developing and evaluating CAD of ovarian cancer for predicting cancer prognosis, FA in CAD of breast MRI for predict efficacy of neoadjuvant chemotherapies, SM in CAD of cancer detection, MH in CAD of predicting short-term cancer risk, and GD in CAD of predicting the likelihood of cases being malignant. In addition, BZ took the lead to draft the manuscript and all other authors have made comments and suggestions to help revise the manuscript. All authors read and approved the final manuscript.
The studies mentioned in this paper were supported in part by
(1) Grants R01 CA160205 and R01 CA197150 from the National Cancer Institute, National Institutes of Health, USA.
(2) Grant HR15–016 from Oklahoma Center for the Advancement of Science and Technology, USA.
The authors declare that they have no competing interests.
- 8.Carney PA, Miglioretti DL, Yankaskas BC, Kerlikowske K, Rosenberg R, Rutter CM et al (2003) Individual and combined effects of age, breast density, and hormone replacement therapy use on the accuracy of screening mammography. Ann Intern Med 138(3):168–175. https://doi.org/10.7326/0003-4819-138-3-200302040-00008 CrossRefGoogle Scholar
- 9.Hubbard RA, Kerlikowske K, Flowers CI, Yankaskas BC, Zhu WW, Miglioretti DL (2011) Cumulative probability of false-positive recall or biopsy recommendation after 10 years of screening mammography: a cohort study. Ann Intern Med 155(8):481–492. https://doi.org/10.7326/0003-4819-155-8-201110180-00004 CrossRefGoogle Scholar
- 30.Wang YZ, Thai T, Moore K, Ding K, McMeekin S, Liu H et al (2016) Quantitative measurement of adiposity using CT images to predict the benefit of bevacizumab-based chemotherapy in epithelial ovarian cancer patients. Oncol Lett 12(1):680–686. https://doi.org/10.3892/ol.2016.4648 CrossRefGoogle Scholar
- 32.Brawley OW (2012) Risk-based mammography screening: an effort to maximize the benefits and minimize the harms. Ann Intern Med 156(9):662–663. https://doi.org/10.7326/0003-4819-156-9-201205010-00012 CrossRefGoogle Scholar
- 41.Burger RA, Brady MF, Bookman MA, Monk BJ, Walker JL, Homesley HD et al (2014) Risk factors for GI adverse events in a phase III randomized trial of bevacizumab in first-line therapy of advanced ovarian cancer: a gynecologic oncology group study. J Clin Oncol 32(12):1210–1217. https://doi.org/10.1200/JCO.2013.53.6524 CrossRefGoogle Scholar
- 43.Wang YZ, Qiu YC, Thai T, Moore K, Liu H, Zhang B (2017) A two-step convolutional neural network based computer-aided detection scheme for automatically segmenting adipose tissue volume depicting on CT images. Comput Methods Prog Biomed 144:97–104. https://doi.org/10.1016/j.cmpb.2017.03.017 CrossRefGoogle Scholar
- 44.Mirniaharikandehei S, Patil O, Aghaei F, Wang YZ, Zheng B (2017) Exploring a new quantitative image marker to assess benefit of chemotherapy to ovarian cancer patients. In: Proceedings of SPIE medical imaging 2017: imaging informatics for healthcare, research, and applications, SPIE, Orlando, Florida, United States, 13 March 2017. https://doi.org/10.1117/12.2254163 CrossRefGoogle Scholar
- 47.Heidari M, Khuzani AZ, Hollingsworth AB, Danala G, Mirniaharikandehei S, Qiu YC et al (2018) Prediction of breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm. Phys Med Biol 63(3):035020. https://doi.org/10.1088/1361-6560/aaa1ca CrossRefGoogle Scholar
- 50.Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, BurlingtonGoogle Scholar
- 57.Wade KNS, Brady MF, Thai T, Wang Y, Zheng B, Salani R et al (2019) Measurements of adiposity as prognostic biomarkers for survival with anti-angiogenic treatment in epithelial ovarian cancer: an NRG oncology/gynecologic oncology group ancillary data analysis of GOG 218. Gynecol Oncol. https://doi.org/10.1016/j.ygyno.2019.07.020 CrossRefGoogle Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.