The graphical exploration of quantitative and qualitative data is an initial but essential step in modern statistical data analysis. Matrix visualization (Chen, 2002; Chen et al., 2004) is a graphical technique that can simultaneously explore the associations among thousands of subjects, variables, and their interactions without first reducing the dimensionality of the data. It involves permuting the rows and columns of the raw data matrix, together with the corresponding proximity matrices, using suitable seriation (reordering) algorithms. The permuted raw data matrix and the two proximity matrices (one for subjects, one for variables) are then displayed as matrix maps via suitable color spectra, so that the subject clusters, variable groups, and interactions embedded in the dataset can be extracted visually.
Keywords: Exploratory Data Analysis, Proximity Measure, Proximity Matrix, Hierarchical Cluster Tree, Binary Data Matrix
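The workflow summarized in the abstract (compute proximity matrices, seriate, permute, display) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the cited papers: Euclidean distance serves as the proximity measure, a greedy nearest-neighbour ordering stands in for a proper seriation algorithm, and text characters stand in for a color spectrum. All function names are hypothetical.

```python
import math


def euclidean_proximity(vectors):
    """Pairwise Euclidean distance matrix (assumed proximity measure)."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]


def greedy_seriation(prox):
    """Greedy nearest-neighbour ordering starting from item 0.

    A simple stand-in for a seriation algorithm, chosen for brevity;
    it is not the method used in the cited papers.
    """
    order = [0]
    remaining = set(range(1, len(prox)))
    while remaining:
        last = order[-1]
        # Pick the closest unvisited item; break ties by index.
        nxt = min(remaining, key=lambda j: (prox[last][j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def permute(matrix, row_order, col_order):
    """Reorder the rows and columns of a matrix."""
    return [[matrix[i][j] for j in col_order] for i in row_order]


def matrix_map(matrix, shades=" .:*#"):
    """Render a matrix as text; later characters in `shades` mean larger values."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    top = len(shades) - 1
    return "\n".join(
        "".join(shades[round((v - lo) / span * top)] for v in row)
        for row in matrix
    )


def matrix_visualization(data):
    """Seriate rows and columns, then permute the data and both proximity matrices."""
    columns = [list(col) for col in zip(*data)]
    row_prox = euclidean_proximity(data)      # subjects x subjects
    col_prox = euclidean_proximity(columns)   # variables x variables
    row_order = greedy_seriation(row_prox)
    col_order = greedy_seriation(col_prox)
    return {
        "row_order": row_order,
        "col_order": col_order,
        "data": permute(data, row_order, col_order),
        "row_prox": permute(row_prox, row_order, row_order),
        "col_prox": permute(col_prox, col_order, col_order),
    }


# Toy data: two subject clusters hidden by an interleaved row/column order.
toy = [
    [9, 1, 8, 0],
    [0, 9, 1, 8],
    [8, 0, 9, 1],
    [1, 8, 0, 9],
]
result = matrix_visualization(toy)
for name in ("data", "row_prox", "col_prox"):
    print(name)
    print(matrix_map(result[name]))
```

On this toy matrix the reordering gathers the large values into two diagonal blocks of the data map, which is the kind of cluster structure the matrix maps are meant to reveal. For real data a seriation method with better global behaviour would be preferable to the greedy ordering used here.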
- Bar-Joseph, Z., Gifford, D.K. and Jaakkola, T.S. (2001). Fast optimal leaf ordering for hierarchical clustering, Bioinformatics, 17:S22–S29.
- Bertin, J. (1967). Sémiologie Graphique, Paris: Editions Gauthier-Villars. English translation by William J. Berg as Semiology of Graphics: Diagrams, Networks, Maps. University of Wisconsin Press, Madison, WI, 1983.
- Chang, S.C., Chen, C.H., Chi, Y.Y. and Ouyoung, C.W. (2002). Relativity and resolution for high dimensional information visualization with generalized association plots (GAP), Section for Invited Papers, Proceedings in Computational Statistics 2002 (Compstat 2002), Berlin, Germany, 55–66.
- Chen, C.H. (1996). The properties and applications of the convergence of correlation matrices. In 1996 Proceedings of the Section on Statistical Graphics of the American Statistical Association, 49–54.
- Chen, C.H. (1999). Extensions of generalized association plots, 1999 Proceedings of the Section on Statistical Graphics of the American Statistical Association, 111–116.
- Chen, C.H., Hwu, H.G., Jang, W.J., Kao, C.H., Tien, Y.J., Tzeng, S. and Wu, H.M. (2004). Matrix visualization and information mining, Proceedings in Computational Statistics 2004 (Compstat 2004), pp. 85–100, Physica-Verlag, Heidelberg.
- Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7:179–188.
- Minnotte, M. and West, W. (1998). The data image: a tool for exploring high dimensional data sets, in Proceedings of the ASA Section on Statistical Graphics, Dallas, TX, 25–33.
- Streng, R. (1991). Classification and seriation by iterative reordering of a data matrix. In Classification, Data Analysis, and Knowledge Organization: Models and Methods with Applications (Edited by H.H. Bock and P. Ihm), 121–130. Springer, New York.
- Tibshirani, R., Hastie, T., Eisen, M., Ross, D., Botstein, D. and Brown, P. (1999). Clustering methods for the analysis of DNA microarray data. Technical Report, Stanford University, Oct. 1999.
- Tien, Y.J., Lee, Y.S., Wu, H.M. and Chen, C.H. (2006). Integration of clustering and visualization methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles. Technical Report 2006-11, Institute of Statistical Science, Academia Sinica, Taiwan.
- Unwin, A.R. and Hofmann, H. (1998). New interactive graphics tools for exploratory analysis of spatial data. In Innovations in GIS 5, ed. S. Carver, pp. 46–55. Taylor & Francis, London.
- Wu, H.M. and Chen, C.H. (2005). Covariate adjusted matrix visualization. Technical Report. Institute of Statistical Science, Academia Sinica, Taiwan.