A Comparative Machine Learning Algorithm to Predict the Bone Metastasis Cervical Cancer with Imbalance Data Problem
Abstract
This paper attempts to develop and validate a tool to predict the immediate results of radiation on bone metastasis in cervical cancer cases. The bone metastasis cases are drawn from radiation treatment data, which is imbalanced. Learning from such data, known as class imbalance learning (CIL), is a recognized challenge in data mining: it makes machine learning more difficult and reduces classifier performance. In this paper, we address the class imbalance with the synthetic minority over-sampling technique (SMOTE) and compare several classification algorithms built on the resampled data: Ant-Miner, RIPPER, Ridor, PART, ADTree, C4.5, ELM and Weighted ELM. Accuracy, G-mean and F-measure are used to evaluate performance. The results show that RIPPER outperformed the other algorithms in Accuracy and F-measure, whereas Weighted ELM outperformed the other algorithms in G-mean. These findings may be useful when evaluating clinical assessments.
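As a rough illustration of the two ingredients named in the abstract, the sketch below shows a simplified SMOTE-style oversampler and the three evaluation measures. It is not the authors' implementation: the function names `smote` and `evaluate`, the parameters `k` and `seed`, and the random choice of the base sample (the original SMOTE iterates over every minority sample) are assumptions made for this example. G-mean is taken here as the geometric mean of sensitivity and specificity, and F-measure as the harmonic mean of precision and recall.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (illustrative assumption).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def evaluate(y_true, y_pred, positive=1):
    """Return Accuracy, G-mean and F-measure for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0

    g_mean = math.sqrt(recall * specificity)
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "g_mean": g_mean, "f_measure": f_measure}


if __name__ == "__main__":
    # Toy data: 2 minority (positive) cases among 10.
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
    print(evaluate(y_true, y_pred))

    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(minority, n_new=3, k=2))
```

The choice of measures matters on imbalanced data: on the toy labels above, a classifier that always predicts the majority class would still reach an Accuracy of 0.8 while its G-mean and F-measure are both 0, which is why different algorithms can rank first under different measures.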
Keywords
cervical cancer, classification algorithm, radiotherapy, imbalance data, machine learning, metastasis