A Monte Carlo permutation test for co-occurrence data

Published in Quality & Quantity

Abstract

Researchers commonly use co-occurrence counts to assess the similarity of objects. This paper illustrates how traditional association measures can lead to misguided significance tests of co-occurrence in settings where the usual multinomial sampling assumptions do not hold. I propose a Monte Carlo permutation test that preserves the original distributions of the co-occurrence data. I illustrate the test on a dataset of organizational categorization, in which I investigate the relations between organizational categories (such as “Argentine restaurants” and “Steakhouses”).
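
The preview does not spell out the procedure, so the following is only a minimal sketch of one way such a test could look. It assumes that "preserving the original distributions" means keeping fixed both the number of categories each organization holds and the number of organizations in each category, and it randomizes labels by pairwise swaps. The function names, the swap scheme, and the toy data are illustrative assumptions, not the author's implementation.

```python
import random

def cooccurrence(memberships, a, b):
    """Count organizations that carry both category a and category b."""
    return sum(1 for cats in memberships if a in cats and b in cats)

def shuffle_preserving_margins(memberships, n_swaps, rng):
    """Randomize category labels with pairwise swaps (assumed scheme).

    Each accepted swap exchanges one category between two organizations,
    so every organization keeps its number of categories and every
    category keeps its number of organizations.
    """
    sets = [set(cats) for cats in memberships]  # work on copies
    edges = [(i, c) for i, cats in enumerate(sets) for c in sorted(cats)]
    for _ in range(n_swaps):
        e1, e2 = rng.randrange(len(edges)), rng.randrange(len(edges))
        (o1, c1), (o2, c2) = edges[e1], edges[e2]
        # Skip swaps that change nothing or would duplicate a category.
        if o1 == o2 or c1 == c2 or c2 in sets[o1] or c1 in sets[o2]:
            continue
        sets[o1].remove(c1)
        sets[o1].add(c2)
        sets[o2].remove(c2)
        sets[o2].add(c1)
        edges[e1], edges[e2] = (o1, c2), (o2, c1)
    return sets

def permutation_p_value(memberships, a, b, n_samples=200, n_swaps=1000, seed=0):
    """One-sided Monte Carlo p-value for the co-occurrence of a and b."""
    rng = random.Random(seed)
    observed = cooccurrence(memberships, a, b)
    hits = 0
    for _ in range(n_samples):
        shuffled = shuffle_preserving_margins(memberships, n_swaps, rng)
        if cooccurrence(shuffled, a, b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_samples + 1)

# Made-up toy data: each set lists the categories of one organization.
toy = ([{"Argentine", "Steakhouse"}] * 6
       + [{"Sushi", "Japanese"}] * 6
       + [{"Steakhouse", "Bar"}] * 4
       + [{"Japanese", "Bar"}] * 4)
observed, p = permutation_p_value(toy, "Argentine", "Steakhouse")
print(observed, p)
```

In this sketch, `n_swaps` controls how thoroughly each shuffled dataset is randomized and `n_samples` controls the resolution of the estimated p-value; both defaults are arbitrary and would need to scale with the size of the data.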

Notes

  1. Obviously, the number of permutations needed depends on the sample size. Given that about 5,000 organizations have two or more categories, and that most of these organizations are in only two or three categories, 10,000 permutations are likely enough to arrive at random category associations.

References

  • Agresti, A.: A survey of exact inference for contingency tables. Stat. Sci. 7, 131–177 (1992)

  • Breiger, R.L.: The duality of persons and groups. Soc. Forces 53, 181–190 (1974)

  • Dean, J., Henzinger, M.R.: Finding related pages in the World Wide Web. Comput. Netw. 31, 1467–1479 (1999)

  • Garfield, E.: Citation analysis as a tool in journal evaluation. Science 178, 471–479 (1972)

  • Good, P.I.: Permutation, Parametric and Bootstrap Tests of Hypotheses. Springer, New York (2005)

  • Hubert, L.J.: Combinatorial data analysis. Psychometrika 50, 449–467 (1985)

  • Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA (1999)

  • Pearson, K.: On a criterion that a given system of deviations from the probable in the case of correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philos. Mag. 5, 157–175 (1900)

  • Wickens, T.D.: Multiway Contingency Tables Analysis for the Social Sciences. Lawrence Erlbaum Associates, Hillsdale (1989)

Author information

Correspondence to Balázs Kovács.

About this article

Cite this article

Kovács, B. A Monte Carlo permutation test for co-occurrence data. Qual Quant 48, 955–960 (2014). https://doi.org/10.1007/s11135-012-9817-x

  • DOI: https://doi.org/10.1007/s11135-012-9817-x
