Data Analysis of Microarrays Using SciCraft

  • Bjørn K. Alsberg
  • Lars Kirkhus
  • Truls Tangstad
  • Endre Anderssen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3303)


SciCraft is a general-purpose open-source data analysis tool that can be used in the analysis of microarrays. Its main advantage is the ability to integrate different types of software through an intuitive, user-friendly graphical interface. The user controls the flow of analysis and visualisation through a visual programming environment (VPE) in which programs are drawn as diagrams. These diagrams consist of nodes and links: the nodes are methods or operators, and the links are lines showing the flow of data between the nodes. This diagrammatic approach is particularly suited to representing the various data analysis pipelines used in the analysis of microarrays.
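
The node-and-link idea can be illustrated with a minimal sketch. This is not SciCraft code: the `Diagram` class, its method names and the three-step pipeline are hypothetical, and the intensity values are made up. It only shows how a diagram of methods connected by data links can be evaluated in dependency order.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple


class Diagram:
    """Toy dataflow diagram (hypothetical): nodes are functions, links carry data."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable] = {}
        self.links: List[Tuple[str, str]] = []  # (source node, target node)

    def add_node(self, name: str, func: Callable) -> None:
        self.nodes[name] = func

    def add_link(self, source: str, target: str) -> None:
        self.links.append((source, target))

    def run(self) -> Dict[str, object]:
        """Evaluate every node once, in an order that respects the links."""
        inputs = {name: [] for name in self.nodes}  # upstream nodes, in link order
        pending = {name: 0 for name in self.nodes}  # upstream nodes not yet evaluated
        for source, target in self.links:
            inputs[target].append(source)
            pending[target] += 1

        ready = deque(name for name, count in pending.items() if count == 0)
        results: Dict[str, object] = {}
        while ready:
            name = ready.popleft()
            args = [results[source] for source in inputs[name]]
            results[name] = self.nodes[name](*args)
            # Release downstream nodes whose inputs are now all available.
            for source, target in self.links:
                if source == name:
                    pending[target] -= 1
                    if pending[target] == 0:
                        ready.append(target)
        if len(results) != len(self.nodes):
            raise ValueError("diagram contains a cycle")
        return results


# Hypothetical three-step pipeline on made-up intensity values.
diagram = Diagram()
diagram.add_node("load", lambda: [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
diagram.add_node("center", lambda rows: [[x - sum(r) / len(r) for x in r] for r in rows])
diagram.add_node("row_range", lambda rows: [max(r) - min(r) for r in rows])
diagram.add_link("load", "center")
diagram.add_link("center", "row_range")

results = diagram.run()
print(results["row_range"])
```

Each node sees only the outputs of the nodes linked into it, so a step can be swapped or rewired without touching the rest of the pipeline.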

Efficient integration of methods from different computer languages and programs is accomplished through plug-ins that handle the necessary communication and data format conversion. Plug-ins are currently available for Octave (an open-source Matlab clone), Python and R.
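
One way to picture the plug-in role is the sketch below. It is an assumption-laden illustration, not SciCraft's actual plug-in API: the `Plugin` interface, the `Matrix` exchange format, the `call` helper and the text-only `TextPlugin` (a stand-in for an external interpreter such as Octave or R) are all hypothetical. The point is that each plug-in hides the conversion between a shared data format and whatever its back end expects.

```python
from abc import ABC, abstractmethod
from typing import List

Matrix = List[List[float]]  # hypothetical common exchange format between nodes


class Plugin(ABC):
    """Hypothetical plug-in interface: one subclass per back-end language."""

    @abstractmethod
    def run(self, method: str, data: Matrix) -> Matrix:
        """Run a named method on the back end and return the result as a Matrix."""


class PythonPlugin(Plugin):
    """Back end that runs in-process, so no format conversion is needed."""

    def __init__(self) -> None:
        self.methods = {"transpose": lambda m: [list(col) for col in zip(*m)]}

    def run(self, method: str, data: Matrix) -> Matrix:
        return self.methods[method](data)


class TextPlugin(Plugin):
    """Stand-in for an external interpreter that only exchanges text."""

    def run(self, method: str, data: Matrix) -> Matrix:
        # Convert to the back end's format, call it, and convert the reply back.
        encoded = ";".join(",".join(repr(x) for x in row) for row in data)
        reply = self._fake_interpreter(method, encoded)
        return [[float(x) for x in row.split(",")] for row in reply.split(";")]

    @staticmethod
    def _fake_interpreter(method: str, text: str) -> str:
        # Simulates the external program; a real plug-in would talk to a process.
        if method != "double":
            raise ValueError(f"unknown method: {method}")
        rows = [[float(x) * 2 for x in row.split(",")] for row in text.split(";")]
        return ";".join(",".join(repr(x) for x in row) for row in rows)


plugins = {"python": PythonPlugin(), "text": TextPlugin()}


def call(language: str, method: str, data: Matrix) -> Matrix:
    """Dispatch a node's method to the plug-in registered for its language."""
    return plugins[language].run(method, data)


doubled = call("text", "double", [[1.0, 2.5], [3.0, 4.0]])
transposed = call("python", "transpose", doubled)
print(transposed)
```

Because both plug-ins accept and return the same `Matrix` type, the output of one back end can feed a node that runs in another.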


Open Source; Gene Expression Data; Partial Least Squares Regression; Partial Least Squares Discriminant Analysis; Module Diagram
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Bjørn K. Alsberg (1)
  • Lars Kirkhus (1)
  • Truls Tangstad (1)
  • Endre Anderssen (1)

  1. The Chemometrics and Bioinformatics Group (CBG), Department of Chemistry, Norwegian University of Science and Technology (NTNU), Division of Physical Chemistry, Realfagbygget, Trondheim
