Advances in Data Analysis and Classification, Volume 3, Issue 2, pp 169–184

On Robinsonian dissimilarities, the consecutive ones property and latent variable models

  • Matthijs J. Warrens
Open Access
Regular Article

Abstract

A dissimilarity measure on a set of objects is Robinsonian if its matrix can be symmetrically permuted so that its elements do not decrease when moving away from the main diagonal along any row or column. The Robinson property of a dissimilarity reflects an order of the objects. If a dissimilarity is not observed directly, it must be obtained from the data. Given that an ordinal structure is assumed to underlie the data, the dissimilarity function of choice may or may not recover the order correctly. For four dissimilarity measures for binary data it is investigated what ordinal data structure of 0s and 1s is correctly recovered. We derive sufficient conditions for the dissimilarity functions to be Robinsonian. The sufficient conditions differ with the dissimilarity measures. The paper concludes with some limitations of the study.
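The definition of the Robinson property above can be made concrete with a small sketch. The function names `is_robinson` and `robinsonian_order` are hypothetical and not taken from the article, and the brute-force search over all permutations is only an illustration for a handful of objects, not an efficient recognition algorithm. The input is assumed to be a symmetric matrix given as a list of lists.

```python
from itertools import permutations


def is_robinson(d):
    """Check the Robinson property for a symmetric matrix in its current order.

    Entries must not decrease when moving away from the main diagonal.
    Because d is assumed symmetric, checking the rows also covers the columns.
    """
    n = len(d)
    for i in range(n):
        # Moving right from the diagonal along row i.
        for j in range(i, n - 1):
            if d[i][j] > d[i][j + 1]:
                return False
        # Moving left from the diagonal along row i.
        for j in range(i, 0, -1):
            if d[i][j] > d[i][j - 1]:
                return False
    return True


def robinsonian_order(d):
    """Return the first object order (in lexicographic order) that makes d
    a Robinson matrix, or None if no such order exists.

    Brute force over all n! orders: an illustrative assumption, usable
    only for very small n.
    """
    n = len(d)
    for order in permutations(range(n)):
        # Apply the same permutation to rows and columns.
        permuted = [[d[a][b] for b in order] for a in order]
        if is_robinson(permuted):
            return list(order)
    return None


# Three objects located on a line at positions 3, 0 and 1 (made-up data);
# the dissimilarity is the absolute difference of the positions.
d = [
    [0, 3, 2],
    [3, 0, 1],
    [2, 1, 0],
]

print(is_robinson(d))        # False: row 0 reads 0, 3, 2
print(robinsonian_order(d))  # [0, 2, 1]
```

In the example, the matrix is not Robinson in the given order, but placing object 2 between objects 0 and 1 recovers the order of the objects along the line, which is the sense in which a Robinsonian dissimilarity reflects an order of the objects.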

Keywords

Dissimilarity measures · Binary data · Ordinal comparison · Pyramids · Ordered clustering systems · Weakly pseudo-hierarchies

Mathematics Subject Classification (2000)

62H05 · 62H20

Notes

Acknowledgments

The author would like to thank Hans-Hermann Bock, Maurizio Vichi and two anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this article.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Unit Methodology and Statistics, Institute of Psychology, Leiden University, Leiden, The Netherlands
