, Volume 29, Issue 2, pp 199-226

A New Class of Weighted Similarity Indices Using Polytomous Variables

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

We introduce new similarity measures between two subjects, with reference to variables with multiple categories. In contrast to traditionally used similarity indices, they also take into account the frequency of the categories of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to differently evaluate the pairwise presence of a rare category from the pairwise presence of a widespread one. A weighting criterion for each category derived from Shannon’s information theory is suggested. There are two versions of the weighted index: one for independent categorical variables and one for dependent variables. The suitability of the proposed indices is shown in this paper using both simulated and real world data sets.

The authors thank the Editor and the anonymous referees for their comments and suggestions. We feel that the paper has substantially improved due to their helpful feedback.