Statistics in Cancer: Diagnosis, Disease Progression, Treatment Efficacy, and Patient Survival Studies

Chakrabartty, Satyendra Nath; Talukdar, Gopesh Chandra

doi:10.1007/978-981-16-4752-9_22

Abstract

This article proposes a simple nonparametric measure for diagnosing cancer and assessing cancer intensity in an individual patient, without resorting to group data, dimensionality reduction, scaling, or the estimation of weights. The measure also identifies the critical areas or variables requiring attention, can be applied to all non-nominal data, can be used to find the mean, variance, and confidence interval for group data, and facilitates statistical tests of hypotheses.

The cancer intensity measure facilitates ranking and classifying a group of patients and quantifying the progress of treatment at the individual and group levels. Using suitably designed group data, an attempt can be made to find a small interval of cancer intensity values for each type of cancer that may be associated with Stage IV or metastatic cancer. The proposed measure also offers an alternative approach to estimating the survival function of cancer patients. This theoretical study opens a number of new areas of statistical analysis in cancer treatment; an empirical study based on it would be of vital interest.

Author information

Authors and Affiliations

Indian Ports Association, Indian Maritime University, Noida, Uttar Pradesh, India
Satyendra Nath Chakrabartty
ESI Institute of Pain Management, ESI Hospital Sealdah Premises, Kolkata, West Bengal, India
Gopesh Chandra Talukdar

Editor information

Editors and Affiliations

Department of Computer Science, Institute of Science, Banaras Hindu University, Varanasi, Uttar Pradesh, India
S. K. Basu
Department of Oncogene Regulation, Chittaranjan National Cancer Institute, Kolkata, West Bengal, India
Chinmay Kumar Panda
Department of Pain and Palliative Medicine, ESI Institute of Pain Management, Kolkata, West Bengal, India
Subrata Goswami

Appendix

1.1 Statistical Notes

1. The chi-square test is a nonparametric test that compares the observed frequencies of values (usually in cross-tabulated data) with their expected frequencies across two or more samples. It is also used as a test of goodness of fit for log-linear models.
2. Regression analysis models the relationship between a dependent variable and one or more independent variables.
3. Logistic regression analysis deals with a binary dependent variable and one or more independent variables measured at the nominal, ordinal, interval, or ratio level.
4. Factor Analysis (FA) and Principal Component Analysis (PCA) are both multivariate statistical techniques for reducing data/variables. PCA forms weighted linear combinations of the observed variables that capture as much of their variance as possible, while FA explains the covariance between the variables.
5. Wilks’ Lambda is the ratio of the within-group sum of squares to the total sum of squares. When the observed group means are nearly equal, Wilks’ Lambda is close to 1; a small lambda occurs when the group means differ.
6. A Bayesian network consists of a set of nodes (random variables) and a set of directed edges (direct dependencies between the variables). The major difficulties are specifying the prior (a priori) probabilities and computing the posterior (a posteriori) probabilities.
7. The k-nearest neighbors (KNN) algorithm is used primarily in classification problems.
8. A neural network starts with a set of variables X_i and associated weights W_i for all i = 1, 2, …, n. A function f is determined whose domain is the weighted sums of the inputs and whose range is an output Y. Neural networks, which have multiple solutions associated with local minima, may not be robust over different samples.
9. Nearest shrunken centroid classification calculates a standardized centroid for each class: for each gene, the average expression in the class divided by the within-class standard deviation for that gene.
10. Random forest, or random decision forest, is a machine learning algorithm used for classification, regression, and other tasks. It constructs an ensemble of many decision trees and combines their outputs for more accurate and stable prediction.
11. Support vector machine (SVM) is a nonparametric method whose analysis consists of both classifying tissue samples and exploring the data for mislabeled or questionable tissue results. Here, the marginal contribution of each component ratio to the score is variable. Moreover, the choice of input variables has a decisive influence on the performance results.
12. Cluster analysis groups a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.
13. When the dependent variable is categorical and the independent variables are on an interval or ratio scale, discriminant analysis develops discriminant functions (df) that discriminate between the categories of the dependent variable.
14. The log-rank test compares the survival distributions of two samples when the data are right-skewed and censored.
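
As an illustration of note 1, the chi-square statistic for a cross-tabulation can be sketched as below. This is a minimal example, not part of the chapter: the function name `chi_square_statistic` and the 2 x 2 counts are hypothetical, and the expected frequencies assume independence of rows and columns.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are two samples, columns are two outcome categories.
observed_counts = [[20, 10], [10, 20]]
statistic, dof = chi_square_statistic(observed_counts)
print(statistic, dof)
```

The statistic is then compared with the chi-square distribution on `dof` degrees of freedom; a large value indicates that the observed frequencies depart from the expected ones.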
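
The behaviour described in note 5 can be checked numerically. The sketch below computes Wilks’ Lambda for a single variable, exactly as the ratio defined in the note; the function name `wilks_lambda` and both data sets are hypothetical, and the multivariate version (a ratio of determinants) is not covered here.

```python
def wilks_lambda(groups):
    """Within-group sum of squares divided by total sum of squares (one variable)."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((x - group_mean) ** 2 for x in group)

    return ss_within / ss_total


# Hypothetical measurements: group means nearly equal vs. clearly different.
similar_groups = [[4.0, 5.0, 6.0], [4.5, 5.5, 6.5]]
separated_groups = [[4.0, 5.0, 6.0], [14.0, 15.0, 16.0]]

lambda_similar = wilks_lambda(similar_groups)      # close to 1
lambda_separated = wilks_lambda(separated_groups)  # close to 0
print(lambda_similar, lambda_separated)
```

With nearly equal group means almost all of the variation is within groups, so the ratio stays near 1; when the means differ, between-group variation dominates the total and the ratio shrinks toward 0.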
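
The log-rank test in note 14 can be sketched as follows, assuming the standard two-sample form: at each distinct event time, the observed number of events in one group is compared with the number expected if both groups shared the same hazard. The function name `logrank_test`, the event coding (1 = event observed, 0 = right-censored), and the survival times are all hypothetical and are not taken from the chapter.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical survival times in months; 0 marks a right-censored patient.
times_a, events_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
times_b, events_b = [1, 2, 3, 4, 5, 8], [1, 1, 1, 1, 1, 1]

chi2, p_value = logrank_test(times_a, events_a, times_b, events_b)
print(chi2, p_value)
```

Censored patients contribute to the at-risk counts up to their censoring time but never count as events, which is how the test uses incomplete follow-up without discarding it.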

About this chapter

Cite this chapter

Chakrabartty, S.N., Talukdar, G.C. (2022). Statistics in Cancer: Diagnosis, Disease Progression, Treatment Efficacy, and Patient Survival Studies. In: Basu, S.K., Panda, C.K., Goswami, S. (eds) Cancer Diagnostics and Therapeutics. Springer, Singapore. https://doi.org/10.1007/978-981-16-4752-9_22

DOI: https://doi.org/10.1007/978-981-16-4752-9_22
Published: 16 April 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-4751-2
Online ISBN: 978-981-16-4752-9
eBook Packages: Biomedical and Life Sciences (R0)