Abstract
Artificial Intelligence (AI) approaches for medical diagnosis and prediction of cancer are important and ever-growing areas of research. Artificial Neural Networks (ANNs) are one such approach and have been successfully applied in these areas. Various types of clinical datasets have been used in intelligent decision-making systems for medical diagnosis, especially of cancer, for over three decades. However, gene expression datasets are complex, with large numbers of attributes, which makes classification and prediction more difficult for AI approaches. The Prostate Cancer dataset is one such dataset, with 12600 attributes and only 102 samples. In this paper, we propose an extended ANN-based approach for classification and prediction of prostate cancer using gene expression data. First, we use four attribute selection approaches, namely Sequential Floating Forward Selection (SFFS), RELIEFF, Sequential Backward Feature Selection (SFBS) and Significant Attribute Evaluation (SAE), to identify the most influential attributes among the 12600. We then use ANNs and Naive Bayes for classification with the complete set of attributes as well as with the various subsets obtained from the attribute selection methods. Experimental results show that, with the full set of attributes, ANNs outperformed Naive Bayes, achieving a classification accuracy of 98.2% compared to 62.74%. Further, with 21 selected attributes obtained with SFFS, ANNs achieved better classification accuracy (100%) than Naive Bayes. For prediction using ANNs, SFFS achieved the best results, with an accuracy of 92.31%, correctly predicting 24 out of 26 samples provided for independent sample testing. Moreover, some of the genes selected by SFFS are identified as having a direct reference to cancer and tumours. Our results indicate that a combination of standard feature selection methods in conjunction with ANNs provides the most impressive results.
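To make the SFFS step named in the abstract concrete, the following is a minimal sketch of a floating forward search, not the authors' implementation. It rests on several assumptions: the selection criterion is the leave-one-out accuracy of a nearest-centroid classifier (a stand-in chosen for brevity; the abstract does not state which criterion was used), and the data come from a hypothetical `toy_data` helper with 6 attributes and 12 samples rather than the 12600-attribute prostate dataset. The names `loo_centroid_accuracy`, `sffs` and `toy_data` are illustrative only.

```python
def loo_centroid_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier on columns `feats`.
    Assumed stand-in criterion; any subset-scoring function could be used."""
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Accumulate per-class sums over all samples except the held-out one.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(feats))
            for k, f in enumerate(feats):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared Euclidean distance from sample i to the class centroid.
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(feats))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def sffs(X, y, k, score=loo_centroid_accuracy):
    """Floating forward search: add the best feature, then drop features
    while doing so beats the best subset already seen at the smaller size."""
    n_features = len(X[0])
    selected, best = [], {}  # best[size] = (score, subset)
    while len(selected) < k:
        # Inclusion step: add the feature that maximises the criterion.
        candidates = [f for f in range(n_features) if f not in selected]
        f_add = max(candidates, key=lambda f: score(X, y, selected + [f]))
        selected = selected + [f_add]
        s = score(X, y, selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Conditional exclusion step: remove the least significant feature
        # only if the reduced subset strictly improves on the recorded best.
        while len(selected) > 2:
            f_rm = max(selected,
                       key=lambda f: score(X, y, [g for g in selected if g != f]))
            reduced = [g for g in selected if g != f_rm]
            s_red = score(X, y, reduced)
            if s_red <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (s_red, list(reduced))
    return best[k][1], best[k][0]


def toy_data(n_pairs=6, n_features=6, informative=3):
    """Hypothetical data: one column separates the classes, the others
    take identical values in both classes and so carry no class signal."""
    X, y = [], []
    for pair in range(n_pairs):
        for label in (0, 1):
            row = [((pair * (j + 1)) % 7) / 7.0 for j in range(n_features)]
            row[informative] = 10.0 * label + 0.1 * pair
            X.append(row)
            y.append(label)
    return X, y


X, y = toy_data()
subset, acc = sffs(X, y, k=3)
print(subset, acc)
```

In this toy setting the search should pick up the single informative column first and then pad the subset with uninformative ones. On real gene expression data the criterion would normally be evaluated on held-out samples, as the abstract's independent test of 26 samples suggests.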
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Tirumala, S.S., Narayanan, A. (2016). Attribute Selection and Classification of Prostate Cancer Gene Expression Data Using Artificial Neural Networks. In: Cao, H., Li, J., Wang, R. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9794. Springer, Cham. https://doi.org/10.1007/978-3-319-42996-0_3
DOI: https://doi.org/10.1007/978-3-319-42996-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42995-3
Online ISBN: 978-3-319-42996-0
eBook Packages: Computer Science (R0)