Knowledge Discovery in Lymphoma Cancer from Gene–Expression

  • Jesús S. Aguilar-Ruiz
  • Francisco Azuaje
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3177)


This paper presents a comprehensive study of the database used by Alizadeh et al. [7] to identify lymphoma cancer subtypes within Diffuse Large B–Cell Lymphoma (DLBCL), focusing on both the feature selection and classification tasks. First, we address the identification of genes relevant to predicting lymphoma cancer types, and then the discovery of the most relevant genes for the Activated B–Like Lymphoma and Germinal Centre B–Like Lymphoma subtypes within DLBCL. Next, decision trees provide knowledge models for predicting both the lymphoma types and the subtypes within DLBCL. Our main conclusion is that the data may be insufficient to predict lymphoma exactly, or even to extract functionally relevant genes.
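The two-stage workflow summarised in the abstract (select relevant genes, then fit a tree-based classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the expression values are invented toy numbers rather than the Alizadeh et al. data, the signal-to-noise filter is a simple stand-in for whichever feature selection methods the paper actually applies, and the classifier is a depth-one decision tree (a stump) rather than a full tree. The names `rank_genes`, `fit_stump`, and `predict` are hypothetical.

```python
from statistics import mean, pstdev


def rank_genes(X, y):
    """Rank gene indices by a two-class signal-to-noise score (labels 0 and 1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b)
        # The small constant avoids division by zero for constant genes.
        scores.append(abs(mean(a) - mean(b)) / (spread + 1e-9))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def fit_stump(X, y, j):
    """Fit a depth-one tree on gene j.

    Returns (threshold, label_if_above, training_errors).
    """
    values = sorted({row[j] for row in X})
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    best = None
    for t in candidates:
        for above in (0, 1):
            errors = sum(
                (above if row[j] > t else 1 - above) != label
                for row, label in zip(X, y)
            )
            if best is None or errors < best[2]:
                best = (t, above, errors)
    return best


def predict(stump, row, j):
    """Classify one sample with a fitted stump on gene j."""
    threshold, above, _ = stump
    return above if row[j] > threshold else 1 - above


# Invented toy data: 6 samples x 3 genes, two classes (not real measurements).
X = [
    [0.1, 2.0, 0.50],
    [0.2, 1.0, 0.40],
    [0.0, 1.5, 0.60],
    [1.1, 1.8, 0.50],
    [0.9, 1.2, 0.45],
    [1.0, 1.6, 0.55],
]
y = [0, 0, 0, 1, 1, 1]

top = rank_genes(X, y)[0]      # index of the most discriminative gene
stump = fit_stump(X, y, top)   # one-split classifier on that gene
```

With so few samples, a zero training error on a single gene says little about generalisation, which is in keeping with the abstract's caution that the data may be insufficient for exact prediction.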


Decision Tree, Feature Selection, Acute Lymphocytic Leukemia, Feature Selection Method, Relevant Gene
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Alizadeh, A.A., Eisen, M., Botstein, D., Brown, P.O., Staudt, L.M.: Probing lymphocyte biology by genomic-scale gene expression analysis. Journal of Clinical Immunology 18, 373–379 (1998)
  2. Han, J., Kamber, M.: Data Mining – Concepts and Techniques. Morgan Kaufmann, San Francisco (2001)
  3. Liu, H., Motoda, H.: Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, Dordrecht (1998)
  4. Gowda, K.C., Krishna, G.: Agglomerative clustering using the concept of mutual nearest neighborhood. Pattern Recognition 10, 105–112 (1977)
  5. Gordon, D.: Classification. Chapman & Hall/CRC, Boca Raton (1999)
  6. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)
  7. Alizadeh, A.A., et al.: Distinct types of diffuse large B–cell lymphoma identified by gene expression profiling. Nature 403, 503–511 (2000)
  8. Harris, N.L., Jaffe, E.S., Diebold, J., Flandrin, G., Muller-Hermelink, H.K., Vardiman, J., Lister, T.A., Bloomfield, C.D.: World Health Organization classification of neoplastic diseases of the hematopoietic and lymphoid tissues: Report of the Clinical Advisory Committee meeting, Airlie House, Virginia, November 1997. Journal of Clinical Oncology 17, 3835–3849 (1999)
  9. Hall, M.A.: Correlation–based feature selection for machine learning. PhD thesis, Department of Computer Science, University of Waikato, New Zealand (1998)
  10. Kira, K., Rendell, L.: A practical approach to feature selection. In: Proceedings of the Ninth International Conference on Machine Learning, pp. 249–256 (1992)
  11. Kononenko, I.: Estimating attributes: analysis and extensions of Relief. In: Proceedings of European Conference on Machine Learning, Springer, Heidelberg (1994)
  12. Liu, H., Setiono, R.: Chi2: Feature selection and discretization of numeric attributes. In: Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence (1995)
  13. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
  14. Azuaje, F.: A computational neural approach to support discovery of gene function and classes of cancer. IEEE Transactions on Biomedical Engineering 48(3), 332–339 (2001)
  15. Holte, R.C.: Very simple classification rules perform well on most commonly used datasets. Machine Learning 11, 63–91 (1993)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Jesús S. Aguilar-Ruiz (1)
  • Francisco Azuaje (2)
  1. Department of Computer Science, University of Seville, Spain
  2. School of Computing and Mathematics, University of Ulster
