Integrative Visual Data Mining of Biomedical Data: Investigating Cases in Chronic Fatigue Syndrome and Acute Lymphoblastic Leukaemia

Visual Data Mining

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4404)

Abstract

This chapter presents an integrative visual data mining approach to biomedical data. The approach and its supporting methodology are presented at a high level. They consistently combine a set of visualisation and data mining techniques that operate over an integrated data set with several diverse components, including medical (clinical) data, patient outcome and interview data, corresponding gene expression and SNP data, domain ontologies and health management data. The practical application of the methodology and the specific data mining techniques used are demonstrated in two case studies focused on the biological mechanisms of two different types of disease: Chronic Fatigue Syndrome and Acute Lymphoblastic Leukaemia. What the two cases have in common is the structure of their data sets.

Editor information

Simeon J. Simoff, Michael H. Böhlen, Arturas Mazeika

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kennedy, P., Simoff, S.J., Catchpoole, D.R., Skillicorn, D.B., Ubaudi, F., Al-Oqaily, A. (2008). Integrative Visual Data Mining of Biomedical Data: Investigating Cases in Chronic Fatigue Syndrome and Acute Lymphoblastic Leukaemia. In: Simoff, S.J., Böhlen, M.H., Mazeika, A. (eds) Visual Data Mining. Lecture Notes in Computer Science, vol 4404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71080-6_21

  • DOI: https://doi.org/10.1007/978-3-540-71080-6_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71079-0

  • Online ISBN: 978-3-540-71080-6

  • eBook Packages: Computer Science, Computer Science (R0)
