Data Collection, Statistical Analysis and Clustering Studies of Cancer Dataset from Viziayanagaram District, AP, India

  • T. Panduranga Vital
  • G. S. V. Prasada Raju
  • D. S. V. G. K. Kaladhar
  • Tarigoppula V. S. Sriram
  • Krishna Apparao Rayavarapu
  • P. V. Nageswara Rao
  • S. T. P. R. C. Pavan Kumar
  • S. Appala Raju
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 249)

Abstract

Cancer detection is a major research area that can be addressed with datasets and data mining techniques. The data were collected from Vizianagaram district (village) during 2013, with 328 instances and 28 attributes (Gender, Age, Cancer Type, Family_members, Drinking, Smoking, Tea, Coffee, perfumes, Morning_eat, Travelling, Wake_up, Sleep, Tensions, Cool_drinks, Icecream, Height, weight, hair_loss, Marital, milk, bath, Oil, Fast_food, other diseases, Mobile, Sports, Mosquito_replents). The dataset was analyzed using Weka version 3.6.3 and Orange v2.7. The histogram shows the highest instance counts for Lung cancer (56), Mouth (40), Bone (40), Skin (32), and Colon (24). More instances were observed in males (53.7%) than in females (46.3%). The disease is more frequent in married people (61%) than in unmarried people (39%), with a mean age of 33.78±10.12, height of 159.02±9.79 cm and weight of 61.55±11.69 kg. Nearly 90.2% of patients have no other diseases, 136 patients (41.5%) prefer drinking alcohol, 72 (22%) prefer smoking, 208 (63.4%) prefer drinking tea, 96 (29.3%) prefer drinking coffee, 216 (65%) prefer eating rice, 80 (24.4%) prefer cool drinks, no patient likes ice cream, 88 (26.8%) prefer milk, 238 (63.4%) prefer sunflower oil in cooking and 68 (26.8%) prefer fast food. The data show hair loss, use of mobile phones and mosquito repellents as major factors in cancer. The study concludes that age, gender, height, weight, marital status, tea, walking, hair loss, mobile phone use and mosquito repellents are major factors/attributes in cancer occurrence.
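
The count/percentage pairs in the abstract can be checked with simple arithmetic against the stated 328 instances. The following is a minimal sketch of that check, not the authors' Weka or Orange workflow; the dictionary keys and the helper name `share_of_total` are labels chosen here for illustration.

```python
TOTAL_INSTANCES = 328  # dataset size stated in the abstract

# Counts as reported in the abstract; the keys are illustrative shorthand,
# not the dataset's actual attribute names.
reported_counts = {
    "alcohol": 136,
    "smoking": 72,
    "tea": 208,
    "coffee": 96,
    "cool_drinks": 80,
    "milk": 88,
}


def share_of_total(count, total=TOTAL_INSTANCES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


percentages = {name: share_of_total(n) for name, n in reported_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {reported_counts[name]} of {TOTAL_INSTANCES} = {pct}%")
```

These six pairs reproduce the reported percentages (41.5, 22.0, 63.4, 29.3, 24.4 and 26.8). The same calculation gives about 65.9% for 216, 72.6% for 238 and 20.7% for 68, which differ from the figures printed for rice, sunflower oil and fast food; the abstract does not indicate which value in each of those pairs is intended.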

Keywords

Cancer, Statistical analysis, Clustering, Vizianagaram

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • T. Panduranga Vital (1)
  • G. S. V. Prasada Raju (2)
  • D. S. V. G. K. Kaladhar (3)
  • Tarigoppula V. S. Sriram (4)
  • Krishna Apparao Rayavarapu (3)
  • P. V. Nageswara Rao (5)
  • S. T. P. R. C. Pavan Kumar (3)
  • S. Appala Raju (1)

  1. CSE, Raghu Engineering College, Visakhapatnam, India
  2. Dept. of Computer Science, School of Distance Education, Andhra University, Visakhapatnam, India
  3. Dept. of Bioinformatics, GITAM University, Visakhapatnam, India
  4. Raghu Engineering College, Visakhapatnam, India
  5. Department of Computer Science and Engineering, GIT, GITAM University, Visakhapatnam, India
