VIEW 2006: Pixelization Paradigm pp 48-54

Pixelisation-Based Statistical Visualisation for Categorical Datasets with Spreadsheet Software

  • Gaj Vidmar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4370)

Abstract

A heat-map type of chart for depicting a large number of cases and up to twenty-five categorical variables with spreadsheet software is presented. It is implemented in Microsoft® Excel using standard formulas, sorting and simple VBA code. The motivating example depicts the accuracy of automated assignment of MeSH® descriptor headings to abstracts of medical articles. Within each abstract, the predicted support for each heading is ranked. Then, for each heading, actual assignment or non-assignment by a human specialist is depicted by a black or white cell, and high or low predicted support is depicted on a nine-point two-colour scale. Thus, each case (abstract) is depicted by one row of a table and each variable (heading) by two adjacent columns. A rank-based classification accuracy measure is calculated for each case, and rows are sorted in order of increasing accuracy downwards. Based on an analogous measure, variables are sorted in order of increasing prediction accuracy rightwards. Another biomedical dataset is presented with a similar chart. Different methods for predicting binary outcomes can be visualised, and the procedure is easily extended to polytomous variables.
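The procedure summarised in the abstract can be sketched outside a spreadsheet as well. The following Python outline is only an illustration under stated assumptions, not the paper's Excel/VBA implementation. The abstract does not define the rank-based accuracy measure, so the sketch substitutes a pairwise (AUC-like) measure: the share of (assigned, non-assigned) pairs in which the assigned heading is ranked higher. The function names (`rank_support`, `rank_accuracy`, `shade`, `build_chart`), the mapping of ranks to nine shade bins, and the toy data are all hypothetical.

```python
from typing import List, Tuple


def rank_support(scores: List[float]) -> List[int]:
    """Rank predicted support within one case; 1 = highest support.
    Ties are broken by column position (an assumption)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for r, j in enumerate(order, start=1):
        ranks[j] = r
    return ranks


def rank_accuracy(ranks: List[int], assigned: List[bool]) -> float:
    """Assumed stand-in for the paper's measure: fraction of
    (assigned, non-assigned) pairs where the assigned item ranks better."""
    pos = [r for r, a in zip(ranks, assigned) if a]
    neg = [r for r, a in zip(ranks, assigned) if not a]
    if not pos or not neg:
        return 1.0  # no pair can be misordered (assumed convention)
    good = sum(1 for p in pos for n in neg if p < n)
    return good / (len(pos) * len(neg))


def shade(rank: int, n_vars: int, levels: int = 9) -> int:
    """Map a rank to a shade bin; 0 = highest support, levels - 1 = lowest."""
    return (rank - 1) * levels // n_vars


def build_chart(
    support: List[List[float]], assigned: List[List[bool]], levels: int = 9
) -> Tuple[List[int], List[int], List[List[Tuple[bool, int]]]]:
    """Return row order, column order, and (assigned, shade) cell pairs."""
    n_vars = len(support[0])
    ranks = [rank_support(row) for row in support]
    # Accuracy per case (row) and, analogously, per variable (column).
    row_acc = [rank_accuracy(r, a) for r, a in zip(ranks, assigned)]
    col_acc = [
        rank_accuracy([r[j] for r in ranks], [a[j] for a in assigned])
        for j in range(n_vars)
    ]
    # Increasing accuracy downwards (rows) and rightwards (columns).
    row_order = sorted(range(len(support)), key=lambda i: row_acc[i])
    col_order = sorted(range(n_vars), key=lambda j: col_acc[j])
    cells = [
        [(assigned[i][j], shade(ranks[i][j], n_vars, levels)) for j in col_order]
        for i in row_order
    ]
    return row_order, col_order, cells


if __name__ == "__main__":
    # Toy data: 3 cases (abstracts) x 3 variables (headings).
    support = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.6], [0.3, 0.7, 0.1]]
    assigned = [[True, False, False], [True, False, True], [False, True, False]]
    rows, cols, cells = build_chart(support, assigned)
    for i, row in zip(rows, cells):
        print(i, row)
```

Each cell pair corresponds to the two adjacent columns per variable described above: the Boolean would drive the black/white cell and the shade bin would drive the colour-scale cell. In a spreadsheet the same steps would map to ranking formulas, a sort, and cell colouring.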

Keywords

Automate Assignment; Biomedical Informatics; Medical Article; Spreadsheet Software; Adjacent Column
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Gaj Vidmar (1)
  1. University of Ljubljana, Faculty of Medicine, Institute of Biomedical Informatics, Vrazov trg 2, SI-1000 Ljubljana, Slovenia