Structural Representation of Categorical Data and Cluster Analysis Through Filters

Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Representing categorical data by nominal measurement keeps all of the information in the data intact, which is not the case with widely used numerical or pseudo-numerical representations such as Likert-type scoring. We first explain this point and then turn to the analysis of nominally represented data. When the number of variables is large, one typically resorts to dimension reduction, and the need for it is often greater with categorical data than with continuous data. In spite of this, Nishisato and Clavel (Behaviormetrika 57:15–32, 2010) proposed an approach diametrically opposed to dimension reduction: they advocate a doubled hyper-space that accommodates both the row variables and the column variables of two-way data in a common space. The rationale of doubled space can also be used to vindicate the Carroll-Green-Schaffer scaling (Carroll JD, Green PE, Schaffer CM (1986) J Mark Res 23(3):271–280). The paper then introduces a simple procedure for analyzing a hyper-dimensional configuration of data, called cluster analysis through filters. A numerical example shows a clear contrast between the dimension-reduction approach and total information analysis by cluster analysis. We argue that our approach is preferable to dimension reduction on two grounds: its results are a factual summary of a multidimensional data configuration, and the procedure is simple and practical.
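The contrast between nominal representation and Likert-type scoring can be illustrated with a minimal sketch. The data, category labels, and function names below are invented for illustration and are not taken from the paper, and the correlation ratio is used only as a simple stand-in for what a nominal (indicator) coding can capture; it is not the dual scaling or the filter-based clustering procedure itself. In this toy example the outcome is U-shaped across the ordered categories, so equal-interval integer scores yield a zero linear correlation, while the indicator coding still recovers the differences between categories.

```python
from statistics import mean

# Hypothetical ordered response categories (illustrative only).
CATEGORIES = ["never", "rarely", "sometimes", "often", "always"]


def likert_scores(responses):
    """Pseudo-numerical coding: category k receives the integer score k (1..5)."""
    return [CATEGORIES.index(r) + 1 for r in responses]


def indicator_coding(responses):
    """Nominal coding: one 0/1 column per category, with no assumed spacing."""
    return [[1 if r == c else 0 for c in CATEGORIES] for r in responses]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_ratio(indicators, y):
    """Eta squared: between-category sum of squares over total sum of squares."""
    grand = mean(y)
    ss_total = sum((v - grand) ** 2 for v in y)
    ss_between = 0.0
    for j in range(len(indicators[0])):
        group = [v for row, v in zip(indicators, y) if row[j] == 1]
        if group:
            ss_between += len(group) * (mean(group) - grand) ** 2
    return ss_between / ss_total


# Toy data: two respondents per category, outcome U-shaped over the categories.
responses = ["never", "never", "rarely", "rarely", "sometimes",
             "sometimes", "often", "often", "always", "always"]
outcome = [5, 3, 2, 0, 1, -1, 0, 2, 3, 5]

r = pearson_r(likert_scores(responses), outcome)
eta_sq = correlation_ratio(indicator_coding(responses), outcome)
print(f"Pearson r with Likert scores:  {r:.3f}")       # 0.000
print(f"Eta squared with nominal code: {eta_sq:.3f}")  # 0.737
```

The integer scores force the categories onto an equally spaced line, so any nonlinear pattern is lost before analysis begins; the indicator columns make no such assumption, at the cost of a higher-dimensional representation.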



Thanks are due to José Garcia Clavel for the calculation of between-set distances of Heuer’s data.


  1. Carroll JD, Green PE, Schaffer CM (1986) Interpoint distance comparisons in correspondence analysis. J Mark Res 23(3):271–280
  2. Greenacre MJ (1989) The Carroll-Green-Schaffer scaling in correspondence analysis: a theoretical and empirical appraisal. J Mark Res 26(3):358–365
  3. Heuer G (1979) Selbstmord bei Kindern und Jugendlichen: ein Beitrag zur Suizidprophylaxe aus pädagogischer Sicht. Klett-Cotta, Stuttgart
  4. Likert R (1932) A technique for the measurement of attitudes. Arch Psychol 22(140):44–53
  5. Nishisato S (1980) Analysis of categorical data: dual scaling and its applications. University of Toronto Press, Toronto
  6. Nishisato S (1984) Forced classification: a simple application of a quantification method. Psychometrika 49:25–36
  7. Nishisato S (1994) Elements of dual scaling: an introduction to practical data analysis. Lawrence Erlbaum Associates, Hillsdale
  8. Nishisato S (1999) Data types and information: beyond the current practice of data analysis. In: Decker R, Gaul W (eds) Classification and information processing at the turn of the Millennium. Springer, Berlin/Heidelberg, pp 40–51
  9. Nishisato S (2006) Correlational structure of multiple-choice data as viewed from dual scaling. In: Greenacre MJ, Blasius J (eds) Multiple correspondence analysis and related methods. Chapman and Hall/CRC, Boca Raton, chap 6, pp 161–178
  10. Nishisato S (2007) Multidimensional nonlinear descriptive analysis. Chapman and Hall/CRC, Boca Raton
  11. Nishisato S (2012a) Optimal quantities for analysis through regression of measurement on data. Bull Data Anal Jpn Classif Soc 1:1–10
  12. Nishisato S (2012b) Reminiscence and a step forward. In: Gaul W, Geyer-Schulz A, Schmidt-Thieme L, Kunze J (eds) Classification, data analysis, and knowledge organization. Springer, Heidelberg, pp 109–119
  13. Nishisato S, Baba Y (1999) On contingency, projection and forced classification of dual scaling. Behaviormetrika 26:207–219
  14. Nishisato S, Clavel JG (2003) A note on between-set distances in dual scaling and correspondence analysis. Behaviormetrika 30(1):87–98
  15. Nishisato S, Clavel JG (2010) Total information analysis: comprehensive dual scaling. Behaviormetrika 57:15–32
  16. Van der Heijden PGM, De Leeuw J (1985) Correspondence analysis used complementary to loglinear analysis. Psychometrika 50(4):429–447

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. University of Toronto, Toronto, Canada
