Categorising Count Data into Ordinal Responses with Application to Ecological Communities

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

Count data sets may contain overdispersed counts for one set of species and underdispersed counts for another, which would require fitting different models (e.g. a negative binomial model for the overdispersed set and a binomial model for the underdispersed one). Additionally, many count data sets contain both very high and very low counts. Categorising these counts into ordinal categories makes the actual counts less influential in the model fitting and yields broad categories that enable us to detect major, broadly based patterns of turnover or nestedness shown by groups of species. In this paper, we present a strategy for categorising count data into ordinal data and implement measures to compare different cluster structures. We apply this categorising strategy to two ecological community data sets and compare the clustering results obtained from the count data with those from the categorised ordinal data. A major advantage of our ordinal approach is that it accommodates all levels of dispersion in the data within one methodology, without treating subsets of the data differently. This reduction in the number of parameters needed to model different levels of dispersion does not substantially change the resulting clustering structure. In the two data sets used in this paper, the ordinal clustering structures were up to 93.1% similar to those from the count data approaches. This has the important implication of supporting simpler, faster data collection using ordinal scales only.
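
As a rough illustration of the two ideas in the abstract, the following Python sketch bins raw counts into ordered categories and then scores how similar two cluster assignments are. It is a sketch under stated assumptions, not the procedure used in the paper: the cut points `[0, 5, 20]`, the function names `categorise` and `rand_index`, the example data, and the choice of the plain Rand index as the similarity measure are all hypothetical choices made for this example.

```python
from bisect import bisect_left
from itertools import combinations


def categorise(counts, cut_points):
    """Map each count to an ordinal level 1..len(cut_points) + 1.

    cut_points must be sorted in increasing order. A count c receives
    level k + 1, where k is the number of cut points strictly below c.
    With the illustrative cut points [0, 5, 20] the levels are:
    1 = zero, 2 = 1 to 5, 3 = 6 to 20, 4 = more than 20.
    """
    return [bisect_left(cut_points, c) + 1 for c in counts]


def rand_index(labels_a, labels_b):
    """Proportion of item pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two items together,
    or both put them apart. Cluster label values themselves do not matter.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must label the same items")
    if len(labels_a) < 2:
        raise ValueError("need at least two items to compare")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical counts for one species across five sites.
counts = [0, 3, 5, 12, 250]
ordinal = categorise(counts, cut_points=[0, 5, 20])

# Hypothetical cluster labels from a count-based and an ordinal-based fit.
clusters_from_counts = [1, 1, 2, 2]
clusters_from_ordinal = [2, 2, 1, 1]
similarity = rand_index(clusters_from_counts, clusters_from_ordinal)
```

Here the very large count of 250 falls into the same top category as any other count above 20, which is the sense in which the actual counts become less influential. The Rand index treats the two example clusterings as identical because only the grouping of items matters, not the label values.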

Supplementary materials accompanying this paper appear on-line.



Author information

Corresponding author

Correspondence to D. Fernández.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 529 KB)


About this article


Cite this article

Fernández, D., Pledger, S. Categorising Count Data into Ordinal Responses with Application to Ecological Communities. JABES 21, 348–362 (2016). https://doi.org/10.1007/s13253-015-0240-3


  • DOI: https://doi.org/10.1007/s13253-015-0240-3
