The Application of Data Mining Techniques to Oral Cancer Prognosis

  • Wan-Ting Tseng
  • Wei-Fan Chiang
  • Shyun-Yeu Liu
  • Jinsheng Roan
  • Chun-Nan Lin
Part of the following topical collections:
  1. Transactional Processing Systems


This study adopted an integrated procedure that combines the clustering and classification features of data mining technology to determine how the symptoms recorded for patients who died of oral cancer differ from those of patients who survived. Two data mining tools, decision tree and artificial neural network, were used to analyze historical cases of oral cancer, and their performance was compared with that of logistic regression, a popular statistical analysis tool. Both the decision tree and the artificial neural network models outperformed the traditional statistical model. For clinicians, however, the trees produced by the decision tree models are easier to interpret than the artificial neural network models. Cluster analysis also revealed that stage 4 patients who have the following four characteristics have an extremely low survival rate: pN is N2b, level of RLNM is I-III, AJCC-T is T4, and cell differentiation grade (G) is moderate.
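The kind of profile-level finding reported above can be illustrated with a minimal sketch. It is a plain group-by tabulation of survival rates per characteristic profile, not the clustering procedure used in the study. The records, the field names (`pN`, `rlnm_level`, `ajcc_t`, `grade`, `survived`), and the function `survival_rate_by_profile` are all hypothetical and invented for illustration only.

```python
from collections import defaultdict

# Hypothetical toy records, invented purely for illustration.
# They are NOT data from the study, and the field names are assumptions.
HIGH_RISK = {"pN": "N2b", "rlnm_level": "I-III", "ajcc_t": "T4", "grade": "moderate"}
LOW_RISK = {"pN": "N0", "rlnm_level": "none", "ajcc_t": "T1", "grade": "well"}

records = [
    {**HIGH_RISK, "survived": False},
    {**HIGH_RISK, "survived": False},
    {**HIGH_RISK, "survived": False},
    {**HIGH_RISK, "survived": True},
    {**LOW_RISK, "survived": True},
    {**LOW_RISK, "survived": True},
]

PROFILE_FIELDS = ("pN", "rlnm_level", "ajcc_t", "grade")


def survival_rate_by_profile(rows, fields=PROFILE_FIELDS):
    """Group rows by their profile tuple and return {profile: (n, survival_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [n_patients, n_survived]
    for row in rows:
        key = tuple(row[f] for f in fields)
        counts[key][0] += 1
        counts[key][1] += int(row["survived"])
    return {key: (n, s / n) for key, (n, s) in counts.items()}


rates = survival_rate_by_profile(records)
for profile, (n, rate) in rates.items():
    print(profile, f"n={n}", f"survival rate={rate:.2f}")
```

In practice the profiles would come out of a clustering step rather than being fixed in advance; the tabulation only shows how a survival rate per profile could then be read off.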


  • Oral cancer
  • Survival analysis
  • Data mining
  • Cluster analysis



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Wan-Ting Tseng (1)
  • Wei-Fan Chiang (1)
  • Shyun-Yeu Liu (1)
  • Jinsheng Roan (2)
  • Chun-Nan Lin (3)

  1. Department of Oral Maxillofacial Surgery, Chi-Mei Medical Center, Tainan City, Taiwan, Republic of China
  2. Department of Information Management, National Chung Cheng University, Chia-yi County, Taiwan, Republic of China
  3. Department of Logistics Management, Shu-Te University, Kaohsiung City, Taiwan, Republic of China
